The purpose of this notebook is to develop a supervised machine learning model that can predict which employees are likely to leave the company. Through predictive modeling and feature engineering, the HR department’s ultimate goal is to understand what factors lead to employees’ departure and to reduce the rate of attrition. The original problem statement required a logistic regression model, and therefore I will focus on this algorithm first. We will then explore other models and compare their performance. The dataset was cleaned in the Data Wrangling Notebook and visualized in the Data Exploration Notebook.
library(ggplot2)
library(repr)
library(caret)
## Loading required package: lattice
library(ROCR)
library(pROC)
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ tibble 3.0.1 ✓ dplyr 1.0.0
## ✓ tidyr 1.0.3 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ✓ purrr 0.3.4
## ── Conflicts ────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x purrr::lift() masks caret::lift()
library(magrittr)
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
library(randomForest)
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:ggplot2':
##
## margin
library(nnet)
options(repr.plot.width=4, repr.plot.height=4) # Set the initial plot area dimensions
dt <- read.csv("FinalData.csv")
# Remove the first column created by exporting/loading the csv file.
dt %<>% select(-1)
# Convert ordered categorical columns to factors.
dt$Education <- ordered(dt$Education,
levels = c("Below College", "College", "Bachelor", "Master", "Doctor"))
dt$BusinessTravel <- ordered(dt$BusinessTravel,
levels = c("Non-Travel", "Travel-Rarely", "Travel-Frequently"))
dt$JobLevel <- ordered(dt$JobLevel,
levels = c(1, 2, 3, 4, 5),
labels = c("1", "2", "3", "4", "5"))
dt$StockOptionLevel <- ordered(dt$StockOptionLevel,
levels = c(0, 1, 2, 3),
labels = c("0", "1", "2", "3"))
dt$EnvironmentSatisfaction <- ordered(dt$EnvironmentSatisfaction,
levels = c("N/A","Low", "Medium", "High", "Very High"))
dt$JobInvolvement <- ordered(dt$JobInvolvement,
levels = c("Low", "Medium", "High", "Very High"))
dt$JobSatisfaction <- ordered(dt$JobSatisfaction,
levels = c("N/A","Low", "Medium", "High", "Very High"))
dt$PerformanceRating <- ordered(dt$PerformanceRating,
levels = c("Low", "Good", "Excellent", "Outstanding"))
dt$WorkLifeBalance <- ordered(dt$WorkLifeBalance,
levels = c("N/A","Bad", "Good", "Better", "Best"))
dt$Attrition <- ordered(dt$Attrition,
levels = c("Stayed", "Left"))
# Convert other categorical variables to the correct type.
catcols <- c("Department", "EducationField", "Gender", "JobRole", "MaritalStatus")
dt %<>% mutate_at(catcols, factor)
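The distinction between `factor` and `ordered` matters downstream: ordered factors carry a ranking of their levels and are encoded with polynomial contrasts by `glm`. A small illustration on toy values (not from our dataset):

```r
# An ordered factor remembers the ranking of its levels
x <- ordered(c("Low", "High", "Medium"), levels = c("Low", "Medium", "High"))
x < "High"    # ordered factors support comparisons against a level
contrasts(x)  # polynomial contrasts (.L, .Q) are what glm will use
```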
A quick view to make sure the data was loaded correctly.
dim(dt)
## [1] 4410 26
# There are 4410 instances, or employees documented in the dataset, with 26 variables.
head(dt)
## Age Attrition BusinessTravel Department DistanceFromHome
## 1 51 Stayed Travel-Rarely Sales 6
## 2 31 Left Travel-Frequently Research & Development 10
## 3 32 Stayed Travel-Frequently Research & Development 17
## 4 38 Stayed Non-Travel Research & Development 2
## 5 32 Stayed Travel-Rarely Research & Development 10
## 6 46 Stayed Travel-Rarely Research & Development 8
## Education EducationField Gender JobLevel JobRole
## 1 College Life Sciences Female 1 Healthcare Representative
## 2 Below College Life Sciences Female 1 Research Scientist
## 3 Master Other Male 4 Sales Executive
## 4 Doctor Life Sciences Male 3 Human Resources
## 5 Below College Medical Male 1 Sales Executive
## 6 Bachelor Life Sciences Female 4 Research Director
## MaritalStatus MonthlyIncome NumCompaniesWorked PercentSalaryHike
## 1 Married 131160 1 11
## 2 Single 41890 0 23
## 3 Married 193280 1 15
## 4 Married 83210 3 11
## 5 Single 23420 4 12
## 6 Married 40710 3 13
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany
## 1 0 1 6 1
## 2 1 6 3 5
## 3 3 5 2 5
## 4 3 13 5 8
## 5 2 9 2 6
## 6 0 28 5 7
## YearsSinceLastPromotion YearsWithCurrManager EnvironmentSatisfaction
## 1 0 0 High
## 2 1 4 High
## 3 0 3 Medium
## 4 7 5 Very High
## 5 0 4 Very High
## 6 7 7 High
## JobSatisfaction WorkLifeBalance JobInvolvement PerformanceRating AvgHrs
## 1 Very High Good High Excellent 7.37
## 2 Medium Best Medium Outstanding 7.72
## 3 Medium Bad High Excellent 7.01
## 4 Very High Better Medium Excellent 7.19
## 5 Low Better High Excellent 8.01
## 6 Medium Good High Excellent 10.80
str(dt)
## 'data.frame': 4410 obs. of 26 variables:
## $ Age : int 51 31 32 38 32 46 28 29 31 25 ...
## $ Attrition : Ord.factor w/ 2 levels "Stayed"<"Left": 1 2 1 1 1 1 2 1 1 1 ...
## $ BusinessTravel : Ord.factor w/ 3 levels "Non-Travel"<"Travel-Rarely"<..: 2 3 3 1 2 2 2 2 2 1 ...
## $ Department : Factor w/ 3 levels "Human Resources",..: 3 2 2 2 2 2 2 2 2 2 ...
## $ DistanceFromHome : int 6 10 17 2 10 8 11 18 1 7 ...
## $ Education : Ord.factor w/ 5 levels "Below College"<..: 2 1 4 5 1 3 2 3 3 4 ...
## $ EducationField : Factor w/ 6 levels "Human Resources",..: 2 2 5 2 4 2 4 2 2 4 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 1 1 2 2 2 1 2 2 2 1 ...
## $ JobLevel : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 4 3 1 4 2 2 3 4 ...
## $ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 1 7 8 2 8 6 8 8 3 3 ...
## $ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 2 3 2 2 3 2 3 2 2 1 ...
## $ MonthlyIncome : int 131160 41890 193280 83210 23420 40710 58130 31430 20440 134640 ...
## $ NumCompaniesWorked : int 1 0 1 3 4 3 2 2 0 1 ...
## $ PercentSalaryHike : int 11 23 15 11 12 13 20 22 21 13 ...
## $ StockOptionLevel : Ord.factor w/ 4 levels "0"<"1"<"2"<"3": 1 2 4 4 3 1 2 4 1 2 ...
## $ TotalWorkingYears : int 1 6 5 13 9 28 5 10 10 6 ...
## $ TrainingTimesLastYear : int 6 3 2 5 2 5 2 2 2 2 ...
## $ YearsAtCompany : int 1 5 5 8 6 7 0 0 9 6 ...
## $ YearsSinceLastPromotion: int 0 1 0 7 0 7 0 0 7 1 ...
## $ YearsWithCurrManager : int 0 4 3 5 4 7 0 0 8 5 ...
## $ EnvironmentSatisfaction: Ord.factor w/ 5 levels "N/A"<"Low"<"Medium"<..: 4 4 3 5 5 4 2 2 3 3 ...
## $ JobSatisfaction : Ord.factor w/ 5 levels "N/A"<"Low"<"Medium"<..: 5 3 3 5 2 3 4 3 5 2 ...
## $ WorkLifeBalance : Ord.factor w/ 5 levels "N/A"<"Bad"<"Good"<..: 3 5 2 4 4 3 2 4 4 4 ...
## $ JobInvolvement : Ord.factor w/ 4 levels "Low"<"Medium"<..: 3 2 3 2 3 3 3 3 3 3 ...
## $ PerformanceRating : Ord.factor w/ 4 levels "Low"<"Good"<"Excellent"<..: 3 4 3 3 3 3 4 4 4 3 ...
## $ AvgHrs : num 7.37 7.72 7.01 7.19 8.01 10.8 6.92 6.73 7.24 7.08 ...
Before diving into developing and comparing machine learning models, we must first decide how we will assess their performance. Using multiple metrics to evaluate the models allows us to understand their strengths and weaknesses, and which metric we focus on depends on the purpose of the model. The following metrics are commonly used in classification problems.
Confusion matrix
This matrix lays out the correctly and incorrectly classified cases in tabular form. For the binary (two-class) case, the confusion matrix is organized as follows:
|  | Scored Positive | Scored Negative |
|---|---|---|
| Actual Positive | True Positive | False Negative |
| Actual Negative | False Positive | True Negative |
In our model, the “Left” category of the Attrition feature is defined as positive and the “Stayed” category as negative. Therefore our confusion matrix will be:
|  | Scored Left | Scored Stayed |
|---|---|---|
| Actual Left | True Positive | False Negative |
| Actual Stayed | False Positive | True Negative |
Accuracy is the proportion of all correctly classified cases: \[Accuracy = \frac{TP+TN}{TP+FP+TN+FN}\] This metric can be misleading on an imbalanced dataset like ours, and is therefore not the best metric for measuring our model’s performance.
Precision, also called Positive Predictive Value, is the fraction of correctly classified positive cases out of all cases classified as positive:
\[Precision = \frac{TP}{TP+FP}\]
Sensitivity, also called Recall or True Positive Rate, is the proportion of actual positive cases that are correctly identified.
\[Sensitivity = \frac{TP}{TP+FN}\]
Specificity, also called Selectivity or True Negative Rate, is the proportion of actual negative cases that are correctly identified.
\[Specificity = \frac{TN}{TN+FP}\]
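These metrics can be computed directly from the confusion-matrix counts. A quick illustration with toy numbers (not our model’s results), taking “Left” as the positive class:

```r
# Toy confusion-matrix counts; "Left" is the positive class
TP <- 40; FN <- 10; FP <- 20; TN <- 130
accuracy    <- (TP + TN) / (TP + FP + TN + FN)  # 0.85
precision   <- TP / (TP + FP)                   # ~0.667
sensitivity <- TP / (TP + FN)                   # 0.80
specificity <- TN / (TN + FP)                   # ~0.867
round(c(accuracy = accuracy, precision = precision,
        sensitivity = sensitivity, specificity = specificity), 3)
```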
The Receiver Operating Characteristic (ROC) curve shows the tradeoff between the True Positive Rate (Sensitivity) and the False Positive Rate, while the Area Under the Curve (AUC) is the integral of the ROC curve. The higher the AUC, the smaller the increase in false positive rate required to achieve a given true positive rate, so a classification model is considered better when its AUC is higher.
For the company, failing to identify an employee who will leave (a False Negative) is more costly than incorrectly predicting that an employee will leave (a False Positive). Therefore we must choose a metric that penalizes failure to identify positive cases (False Negatives) most heavily. In other words, Sensitivity will be more important than Specificity.
In our case, sensitivity is the proportion of employees correctly classified as having left out of all employees who actually left.
Accuracy is a poor metric for imbalanced data where misclassification costs are asymmetric. Similarly, while AUC is often used for binary classification, it too can be misleading when the data are not balanced.
Precision usually has an inverse relationship with Sensitivity, and we will therefore focus on maximizing Sensitivity.
The first helper function generates the confusion matrix and calculates the metric values; the second shows which variables the model deemed important.
perf_met <- function(df) {
#Confusion Matrix Summary
cm <- suppressWarnings(confusionMatrix(data = as.factor(df$score),
reference = as.factor(df$Attrition),
positive = "Left"))
print(cm)
roc_obj <- roc(df$Attrition, df$probs)
cat(paste('AUC =', as.character(round(auc(roc_obj),3)),'\n'))
table <- data.frame(cm$table)
plotTable <- table %>%
mutate(Correctness = ifelse(table$Prediction == table$Reference, "Correct", "Incorrect")) %>%
group_by(Reference) %>%
mutate(Proportion = Freq/sum(Freq))
# Fill alpha relative to sensitivity/specificity by proportional outcomes within reference groups
ggplot(data = plotTable,
mapping = aes(x=Reference, y=Prediction, fill=Correctness, alpha=Proportion)) +
geom_tile() +
geom_text(aes(label=Freq), vjust=.5, fontface="bold", alpha=1) +
scale_fill_manual(values = c(Correct="#264d73", Incorrect="#b30000")) +
xlim(rev(levels(table$Reference))) +
ylim(levels(table$Prediction)) +
theme_light()
}
## Function to show which features are important.
feature_imp <- function(mod) {
imp = varImp(mod)
plot <- ggplot(imp, aes(x=reorder(rownames(imp),Overall), y=Overall)) +
geom_point(color="skyblue", size=2, alpha=0.8) +
geom_segment(aes(x=rownames(imp), xend=rownames(imp), y=0, yend=Overall), color='skyblue') +
xlab('Variable') +
ylab('Overall Importance') +
theme_light() +
coord_flip()
print(anova(mod, test="Chisq"))
print(plot)
}
set.seed(1955)
## Randomly sample cases to create independent training and test data
partition = createDataPartition(dt[,'Attrition'], times = 1, p = 0.7, list = FALSE)
dt_train = dt[partition,] # Create the training sample
dim(dt_train)
## [1] 3088 26
dt_test = dt[-partition,] # Create the test sample
dim(dt_test)
## [1] 1322 26
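`createDataPartition` stratifies on the outcome, so both splits should show roughly the same attrition rate. The idea can be sketched in base R on toy labels (hand-rolled per-class sampling, which is essentially what caret does):

```r
set.seed(1955)
# Toy outcome with roughly a 16% positive rate, mimicking our attrition data
y <- factor(ifelse(runif(1000) < 0.16, "Left", "Stayed"))
# Stratified 70/30 split: sample 70% within each class separately
idx <- unlist(lapply(split(seq_along(y), y),
                     function(i) sample(i, round(0.7 * length(i)))))
train_rate <- mean(y[idx] == "Left")
test_rate  <- mean(y[-idx] == "Left")
round(c(train = train_rate, test = test_rate), 3)  # nearly identical rates
```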
numcols <- dt %>% select_if(is.numeric) %>% colnames
print(numcols)
## [1] "Age" "DistanceFromHome"
## [3] "MonthlyIncome" "NumCompaniesWorked"
## [5] "PercentSalaryHike" "TotalWorkingYears"
## [7] "TrainingTimesLastYear" "YearsAtCompany"
## [9] "YearsSinceLastPromotion" "YearsWithCurrManager"
## [11] "AvgHrs"
preProcValues <- preProcess(dt_train[,numcols], method = c("center", "scale"))
dt_train[,numcols] = predict(preProcValues, dt_train[,numcols])
dt_test[,numcols] = predict(preProcValues, dt_test[,numcols])
head(dt_train[,numcols])
## Age DistanceFromHome MonthlyIncome NumCompaniesWorked
## 1 1.5351693 -0.39949765 1.4088205 -0.6721110
## 3 -0.5410709 0.94741412 2.7295641 -0.6721110
## 5 -0.5410709 0.09028845 -0.8818574 0.5352647
## 6 0.9887903 -0.15460460 -0.5142519 0.1328061
## 7 -0.9781741 0.21273497 -0.1438824 -0.2696525
## 9 -0.6503467 -1.01173027 -0.9452157 -1.0745696
## PercentSalaryHike TotalWorkingYears TrainingTimesLastYear YearsAtCompany
## 1 -1.15935900 -1.3227615 2.5011973 -0.97857482
## 3 -0.06678233 -0.8096420 -0.6088076 -0.33730174
## 5 -0.88621483 -0.2965226 -0.6088076 -0.17698348
## 6 -0.61307067 2.1407948 1.7236961 -0.01666521
## 7 1.29893850 -0.8096420 -0.6088076 -1.13889309
## 9 1.57208267 -0.1682427 -0.6088076 0.30397133
## YearsSinceLastPromotion YearsWithCurrManager AvgHrs
## 1 -0.6824716 -1.16099726 -0.2538199
## 3 -0.6824716 -0.32872313 -0.5216240
## 5 -0.6824716 -0.05129842 0.2222764
## 6 1.4623824 0.78097571 2.2977583
## 7 -0.6824716 -1.16099726 -0.5885750
## 9 1.4623824 1.05840042 -0.3505269
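The key point above is that `preProcess` learns the centering and scaling parameters from the training split only and then applies them to the test split, avoiding leakage. In base R the equivalent operation looks like this (toy data, hypothetical feature):

```r
set.seed(1)
train_x <- rnorm(100, mean = 50, sd = 10)  # hypothetical numeric feature
test_x  <- rnorm(40,  mean = 50, sd = 10)
# Parameters are estimated from the training split only...
mu <- mean(train_x); sdev <- sd(train_x)
train_scaled <- (train_x - mu) / sdev
# ...and the same parameters are reused on the test split
test_scaled <- (test_x - mu) / sdev
round(c(mean(train_scaled), sd(train_scaled)), 3)  # ~0 and 1 by construction
```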
head(dt_train)
## Age Attrition BusinessTravel Department
## 1 1.5351693 Stayed Travel-Rarely Sales
## 3 -0.5410709 Stayed Travel-Frequently Research & Development
## 5 -0.5410709 Stayed Travel-Rarely Research & Development
## 6 0.9887903 Stayed Travel-Rarely Research & Development
## 7 -0.9781741 Left Travel-Rarely Research & Development
## 9 -0.6503467 Stayed Travel-Rarely Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 1 -0.39949765 College Life Sciences Female 1
## 3 0.94741412 Master Other Male 4
## 5 0.09028845 Below College Medical Male 1
## 6 -0.15460460 Bachelor Life Sciences Female 4
## 7 0.21273497 College Medical Male 2
## 9 -1.01173027 Bachelor Life Sciences Male 3
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 1 Healthcare Representative Married 1.4088205 -0.6721110
## 3 Sales Executive Married 2.7295641 -0.6721110
## 5 Sales Executive Single -0.8818574 0.5352647
## 6 Research Director Married -0.5142519 0.1328061
## 7 Sales Executive Single -0.1438824 -0.2696525
## 9 Laboratory Technician Married -0.9452157 -1.0745696
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 1 -1.15935900 0 -1.3227615 2.5011973
## 3 -0.06678233 3 -0.8096420 -0.6088076
## 5 -0.88621483 2 -0.2965226 -0.6088076
## 6 -0.61307067 0 2.1407948 1.7236961
## 7 1.29893850 1 -0.8096420 -0.6088076
## 9 1.57208267 0 -0.1682427 -0.6088076
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 1 -0.97857482 -0.6824716 -1.16099726
## 3 -0.33730174 -0.6824716 -0.32872313
## 5 -0.17698348 -0.6824716 -0.05129842
## 6 -0.01666521 1.4623824 0.78097571
## 7 -1.13889309 -0.6824716 -1.16099726
## 9 0.30397133 1.4623824 1.05840042
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 1 High Very High Good High
## 3 Medium Medium Bad High
## 5 Very High Low Better High
## 6 High Medium Good High
## 7 Low High Bad High
## 9 Medium Very High Better High
## PerformanceRating AvgHrs
## 1 Excellent -0.2538199
## 3 Excellent -0.5216240
## 5 Excellent 0.2222764
## 6 Excellent 2.2977583
## 7 Outstanding -0.5885750
## 9 Outstanding -0.3505269
Let’s start with a logistic regression model, as requested by the client.
Create a copy of the partitioned datasets. (This step is not necessary, but I prefer to keep the original datasets untouched.)
dLM_train <- dt_train
dLM_test <- dt_test
head(dLM_train)
## Age Attrition BusinessTravel Department
## 1 1.5351693 Stayed Travel-Rarely Sales
## 3 -0.5410709 Stayed Travel-Frequently Research & Development
## 5 -0.5410709 Stayed Travel-Rarely Research & Development
## 6 0.9887903 Stayed Travel-Rarely Research & Development
## 7 -0.9781741 Left Travel-Rarely Research & Development
## 9 -0.6503467 Stayed Travel-Rarely Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 1 -0.39949765 College Life Sciences Female 1
## 3 0.94741412 Master Other Male 4
## 5 0.09028845 Below College Medical Male 1
## 6 -0.15460460 Bachelor Life Sciences Female 4
## 7 0.21273497 College Medical Male 2
## 9 -1.01173027 Bachelor Life Sciences Male 3
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 1 Healthcare Representative Married 1.4088205 -0.6721110
## 3 Sales Executive Married 2.7295641 -0.6721110
## 5 Sales Executive Single -0.8818574 0.5352647
## 6 Research Director Married -0.5142519 0.1328061
## 7 Sales Executive Single -0.1438824 -0.2696525
## 9 Laboratory Technician Married -0.9452157 -1.0745696
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 1 -1.15935900 0 -1.3227615 2.5011973
## 3 -0.06678233 3 -0.8096420 -0.6088076
## 5 -0.88621483 2 -0.2965226 -0.6088076
## 6 -0.61307067 0 2.1407948 1.7236961
## 7 1.29893850 1 -0.8096420 -0.6088076
## 9 1.57208267 0 -0.1682427 -0.6088076
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 1 -0.97857482 -0.6824716 -1.16099726
## 3 -0.33730174 -0.6824716 -0.32872313
## 5 -0.17698348 -0.6824716 -0.05129842
## 6 -0.01666521 1.4623824 0.78097571
## 7 -1.13889309 -0.6824716 -1.16099726
## 9 0.30397133 1.4623824 1.05840042
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 1 High Very High Good High
## 3 Medium Medium Bad High
## 5 Very High Low Better High
## 6 High Medium Good High
## 7 Low High Bad High
## 9 Medium Very High Better High
## PerformanceRating AvgHrs
## 1 Excellent -0.2538199
## 3 Excellent -0.5216240
## 5 Excellent 0.2222764
## 6 Excellent 2.2977583
## 7 Outstanding -0.5885750
## 9 Outstanding -0.3505269
head(dLM_test)
## Age Attrition BusinessTravel Department
## 2 -0.6503467 Left Travel-Frequently Research & Development
## 4 0.1145839 Stayed Non-Travel Research & Development
## 8 -0.8688983 Stayed Travel-Rarely Research & Development
## 14 1.0980661 Left Non-Travel Research & Development
## 17 -1.7431047 Stayed Travel-Rarely Research & Development
## 21 -1.1967257 Stayed Travel-Frequently Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 2 0.09028845 Below College Life Sciences Female 1
## 4 -0.88928374 Doctor Life Sciences Male 3
## 8 1.06986064 Bachelor Life Sciences Male 2
## 14 -1.01173027 Below College Medical Male 1
## 17 -0.76683722 College Life Sciences Male 1
## 21 -1.01173027 Master Other Male 2
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 2 Research Scientist Single -0.4891637 -1.0745696
## 4 Human Resources Married 0.3893476 0.1328061
## 8 Sales Executive Married -0.7115555 -0.2696525
## 14 Research Scientist Married -0.1547256 -0.6721110
## 17 Laboratory Technician Single -0.4840610 -0.6721110
## 21 Laboratory Technician Divorced 0.8413600 -0.6721110
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 2 2.1183710 1 -0.6813621 0.1686936
## 4 -1.1593590 3 0.2165969 1.7236961
## 8 1.8452268 3 -0.1682427 -0.6088076
## 14 -1.1593590 2 -0.1682427 0.9461948
## 17 -0.8862148 3 -1.0662017 0.1686936
## 21 0.7526502 0 -0.6813621 0.1686936
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 2 -0.3373017 -0.3760639 -0.05129842
## 4 0.1436531 1.4623824 0.22612629
## 8 -1.1388931 -0.6824716 -1.16099726
## 14 0.4642896 2.0751978 1.33582514
## 17 -0.6579383 -0.3760639 -1.16099726
## 21 -0.1769835 -0.3760639 -0.05129842
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 2 High Medium Best Medium
## 4 Very High Very High Better Medium
## 8 Low Medium Better High
## 14 Low Medium Good Medium
## 17 Very High High Best Medium
## 21 High Medium Bad High
## PerformanceRating AvgHrs
## 2 Outstanding 0.006545263
## 4 Excellent -0.387721916
## 8 Outstanding -0.729916072
## 14 Excellent 1.256297831
## 17 Excellent -0.811745109
## 21 Excellent -0.090161781
set.seed(1955)
glm_mod = glm(Attrition ~ .,
family = binomial, data = dLM_train)
Let’s look at the summary of the model.
anova(glm_mod, test="Chisq")
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 2728.4
## Age 1 70.067 3086 2658.3 < 2.2e-16 ***
## BusinessTravel 2 50.880 3084 2607.4 8.946e-12 ***
## Department 2 33.530 3082 2573.9 5.237e-08 ***
## DistanceFromHome 1 0.001 3081 2573.9 0.98070
## Education 4 9.497 3077 2564.4 0.04981 *
## EducationField 5 7.849 3072 2556.5 0.16478
## Gender 1 1.281 3071 2555.3 0.25770
## JobLevel 4 5.836 3067 2549.4 0.21171
## JobRole 8 13.019 3059 2536.4 0.11120
## MaritalStatus 2 61.618 3057 2474.8 4.166e-14 ***
## MonthlyIncome 1 2.335 3056 2472.4 0.12651
## NumCompaniesWorked 1 45.589 3055 2426.9 1.459e-11 ***
## PercentSalaryHike 1 2.458 3054 2424.4 0.11691
## StockOptionLevel 3 2.660 3051 2421.7 0.44700
## TotalWorkingYears 1 41.635 3050 2380.1 1.100e-10 ***
## TrainingTimesLastYear 1 6.517 3049 2373.6 0.01069 *
## YearsAtCompany 1 0.048 3048 2373.6 0.82728
## YearsSinceLastPromotion 1 23.943 3047 2349.6 9.924e-07 ***
## YearsWithCurrManager 1 31.283 3046 2318.3 2.230e-08 ***
## EnvironmentSatisfaction 4 58.048 3042 2260.3 7.454e-12 ***
## JobSatisfaction 4 46.556 3038 2213.7 1.887e-09 ***
## WorkLifeBalance 4 26.197 3034 2187.5 2.888e-05 ***
## JobInvolvement 3 9.117 3031 2178.4 0.02777 *
## PerformanceRating 1 0.408 3030 2178.0 0.52308
## AvgHrs 1 130.107 3029 2047.9 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Using the trained model, make predictions for the test data.
dLM_test %<>% mutate(probs= predict(glm_mod, newdata=dLM_test, type = 'response'))
score_model = function(df, threshold){
# Label a case "Left" when its predicted probability is at or above the threshold
df %<>% mutate(score = ifelse(probs < threshold, "Stayed", "Left"))
df # return the scored data frame explicitly
}
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.24513583 Stayed
## 2 Stayed 0.00343613 Stayed
## 3 Stayed 0.19882036 Stayed
## 4 Left 0.15731848 Stayed
## 5 Stayed 0.29864046 Stayed
## 6 Stayed 0.21648992 Stayed
## 7 Stayed 0.11884501 Stayed
## 8 Stayed 0.06457121 Stayed
## 9 Stayed 0.83219361 Left
## 10 Stayed 0.03177042 Stayed
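The 0.5 cutoff is only a starting point: lowering it scores more employees as “Left”, trading specificity for the sensitivity we care about. The same thresholding logic on a few hypothetical probabilities:

```r
probs <- c(0.25, 0.003, 0.20, 0.16, 0.83)  # hypothetical predicted probabilities
score_at <- function(p, threshold) ifelse(p < threshold, "Stayed", "Left")
score_at(probs, 0.5)   # only the 0.83 case is flagged as "Left"
score_at(probs, 0.15)  # a lower cutoff flags four potential leavers
```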
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1084 156
## Left 25 57
##
## Accuracy : 0.8631
## 95% CI : (0.8434, 0.8812)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.00824
##
## Kappa : 0.3261
##
## Mcnemar's Test P-Value : < 2e-16
##
## Sensitivity : 0.26761
## Specificity : 0.97746
## Pos Pred Value : 0.69512
## Neg Pred Value : 0.87419
## Prevalence : 0.16112
## Detection Rate : 0.04312
## Detection Prevalence : 0.06203
## Balanced Accuracy : 0.62253
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.825
feature_imp(glm_mod)
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 2728.4
## Age 1 70.067 3086 2658.3 < 2.2e-16 ***
## BusinessTravel 2 50.880 3084 2607.4 8.946e-12 ***
## Department 2 33.530 3082 2573.9 5.237e-08 ***
## DistanceFromHome 1 0.001 3081 2573.9 0.98070
## Education 4 9.497 3077 2564.4 0.04981 *
## EducationField 5 7.849 3072 2556.5 0.16478
## Gender 1 1.281 3071 2555.3 0.25770
## JobLevel 4 5.836 3067 2549.4 0.21171
## JobRole 8 13.019 3059 2536.4 0.11120
## MaritalStatus 2 61.618 3057 2474.8 4.166e-14 ***
## MonthlyIncome 1 2.335 3056 2472.4 0.12651
## NumCompaniesWorked 1 45.589 3055 2426.9 1.459e-11 ***
## PercentSalaryHike 1 2.458 3054 2424.4 0.11691
## StockOptionLevel 3 2.660 3051 2421.7 0.44700
## TotalWorkingYears 1 41.635 3050 2380.1 1.100e-10 ***
## TrainingTimesLastYear 1 6.517 3049 2373.6 0.01069 *
## YearsAtCompany 1 0.048 3048 2373.6 0.82728
## YearsSinceLastPromotion 1 23.943 3047 2349.6 9.924e-07 ***
## YearsWithCurrManager 1 31.283 3046 2318.3 2.230e-08 ***
## EnvironmentSatisfaction 4 58.048 3042 2260.3 7.454e-12 ***
## JobSatisfaction 4 46.556 3038 2213.7 1.887e-09 ***
## WorkLifeBalance 4 26.197 3034 2187.5 2.888e-05 ***
## JobInvolvement 3 9.117 3031 2178.4 0.02777 *
## PerformanceRating 1 0.408 3030 2178.0 0.52308
## AvgHrs 1 130.107 3029 2047.9 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Insights
Now that we have an idea of what a baseline logistic model looks like, we can try to improve it through feature selection and hyperparameter tuning.
Before changing the model itself, we must address the class imbalance by increasing the weight of the “Left” cases.
## Create a weight vector for the training cases.
mean(dLM_train$Attrition == "Left") * 100
## [1] 16.12694
## 16% of training data Left and 84% Stayed.
weights = ifelse(dLM_train$Attrition == 'Left', 0.84, 0.16)
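These hand-picked weights come from the observed 16/84 split; weighting each case by the opposite class’s prevalence makes the two classes contribute equally to the likelihood, which we can verify on toy labels:

```r
y <- factor(c(rep("Stayed", 84), rep("Left", 16)))  # toy labels, 16% positive
p_left <- mean(y == "Left")                         # 0.16
w <- ifelse(y == "Left", 1 - p_left, p_left)        # minority cases weighted up
tapply(w, y, sum)  # total weight is now equal across the two classes
```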
How does the model’s performance change after correcting the imbalance?
## GLM with weights
glm_mod_w = glm(Attrition ~ .,
family = quasibinomial, data = dLM_train,
weights = weights)
dLM_test %<>% mutate(probs= predict(glm_mod_w, newdata=dLM_test, type = 'response'))
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:20, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.665464915 Left
## 2 Stayed 0.009372544 Stayed
## 3 Stayed 0.580036990 Left
## 4 Left 0.389817745 Stayed
## 5 Stayed 0.626400169 Left
## 6 Stayed 0.503305464 Left
## 7 Stayed 0.357322937 Stayed
## 8 Stayed 0.249437938 Stayed
## 9 Stayed 0.967494887 Left
## 10 Stayed 0.167477506 Stayed
## 11 Stayed 0.317412158 Stayed
## 12 Stayed 0.309383026 Stayed
## 13 Left 0.699540461 Left
## 14 Stayed 0.255447103 Stayed
## 15 Stayed 0.564857332 Left
## 16 Stayed 0.094538473 Stayed
## 17 Stayed 0.456090380 Stayed
## 18 Stayed 0.476937353 Stayed
## 19 Stayed 0.201687751 Stayed
## 20 Stayed 0.497591008 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 822 52
## Left 287 161
##
## Accuracy : 0.7436
## 95% CI : (0.7191, 0.7669)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3438
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7559
## Specificity : 0.7412
## Pos Pred Value : 0.3594
## Neg Pred Value : 0.9405
## Prevalence : 0.1611
## Detection Rate : 0.1218
## Detection Prevalence : 0.3389
## Balanced Accuracy : 0.7485
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.819
feature_imp(glm_mod)
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 2728.4
## Age 1 70.067 3086 2658.3 < 2.2e-16 ***
## BusinessTravel 2 50.880 3084 2607.4 8.946e-12 ***
## Department 2 33.530 3082 2573.9 5.237e-08 ***
## DistanceFromHome 1 0.001 3081 2573.9 0.98070
## Education 4 9.497 3077 2564.4 0.04981 *
## EducationField 5 7.849 3072 2556.5 0.16478
## Gender 1 1.281 3071 2555.3 0.25770
## JobLevel 4 5.836 3067 2549.4 0.21171
## JobRole 8 13.019 3059 2536.4 0.11120
## MaritalStatus 2 61.618 3057 2474.8 4.166e-14 ***
## MonthlyIncome 1 2.335 3056 2472.4 0.12651
## NumCompaniesWorked 1 45.589 3055 2426.9 1.459e-11 ***
## PercentSalaryHike 1 2.458 3054 2424.4 0.11691
## StockOptionLevel 3 2.660 3051 2421.7 0.44700
## TotalWorkingYears 1 41.635 3050 2380.1 1.100e-10 ***
## TrainingTimesLastYear 1 6.517 3049 2373.6 0.01069 *
## YearsAtCompany 1 0.048 3048 2373.6 0.82728
## YearsSinceLastPromotion 1 23.943 3047 2349.6 9.924e-07 ***
## YearsWithCurrManager 1 31.283 3046 2318.3 2.230e-08 ***
## EnvironmentSatisfaction 4 58.048 3042 2260.3 7.454e-12 ***
## JobSatisfaction 4 46.556 3038 2213.7 1.887e-09 ***
## WorkLifeBalance 4 26.197 3034 2187.5 2.888e-05 ***
## JobInvolvement 3 9.117 3031 2178.4 0.02777 *
## PerformanceRating 1 0.408 3030 2178.0 0.52308
## AvgHrs 1 130.107 3029 2047.9 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Insight
We can now continue with model tuning.
One concern with a model that performs well is over-fitting. Reducing the number of features can help reduce multicollinearity and improve the model’s generalization.
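A systematic complement to reading the sequential ANOVA table is single-term deletion, which refits the model with each term dropped in turn. A self-contained sketch on simulated data (our actual model objects are not used here):

```r
set.seed(1955)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rbinom(200, 1, plogis(1.5 * d$x1))  # outcome depends on x1 only
fit <- glm(y ~ x1 + x2, data = d, family = binomial)
# Likelihood-ratio test for dropping each term; x1 should be highly
# significant, while the irrelevant x2 typically is not
drop1(fit, test = "Chisq")
```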
glm_mod_1 = glm(Attrition ~
Age +
BusinessTravel +
Department +
DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_1, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.665464915 Left
## 2 Stayed 0.009372544 Stayed
## 3 Stayed 0.580036990 Left
## 4 Left 0.389817745 Stayed
## 5 Stayed 0.626400169 Left
## 6 Stayed 0.503305464 Left
## 7 Stayed 0.357322937 Stayed
## 8 Stayed 0.249437938 Stayed
## 9 Stayed 0.967494887 Left
## 10 Stayed 0.167477506 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 822 52
## Left 287 161
##
## Accuracy : 0.7436
## 95% CI : (0.7191, 0.7669)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3438
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7559
## Specificity : 0.7412
## Pos Pred Value : 0.3594
## Neg Pred Value : 0.9405
## Prevalence : 0.1611
## Detection Rate : 0.1218
## Detection Prevalence : 0.3389
## Balanced Accuracy : 0.7485
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.819
feature_imp(glm_mod_1)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 < 2.2e-16 ***
## Department 2 15.516 3082 1083.88 3.194e-12 ***
## DistanceFromHome 1 0.019 3081 1083.86 0.8012431
## Education 4 4.812 3077 1079.05 0.0025061 **
## EducationField 5 2.892 3072 1076.15 0.0790612 .
## Gender 1 0.821 3071 1075.33 0.0942239 .
## JobLevel 4 2.990 3067 1072.34 0.0371593 *
## JobRole 8 6.925 3059 1065.42 0.0026463 **
## MaritalStatus 2 29.558 3057 1035.86 < 2.2e-16 ***
## MonthlyIncome 1 1.156 3056 1034.70 0.0469944 *
## NumCompaniesWorked 1 23.606 3055 1011.10 < 2.2e-16 ***
## PercentSalaryHike 1 0.982 3054 1010.12 0.0672017 .
## StockOptionLevel 3 1.360 3051 1008.76 0.2000052
## TotalWorkingYears 1 15.570 3050 993.19 3.130e-13 ***
## TrainingTimesLastYear 1 6.100 3049 987.09 5.066e-06 ***
## YearsAtCompany 1 0.319 3048 986.77 0.2968302
## YearsSinceLastPromotion 1 8.872 3047 977.90 3.758e-08 ***
## YearsWithCurrManager 1 24.768 3046 953.13 < 2.2e-16 ***
## EnvironmentSatisfaction 4 26.940 3042 926.19 < 2.2e-16 ***
## JobSatisfaction 4 21.179 3038 905.01 7.551e-15 ***
## WorkLifeBalance 4 11.976 3034 893.03 2.872e-08 ***
## JobInvolvement 3 6.166 3031 886.87 0.0001033 ***
## PerformanceRating 1 0.389 3030 886.48 0.2493721
## AvgHrs 1 52.665 3029 833.81 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 1 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7436 |
| Sensitivity | 0.7559 |
| Specificity | 0.7412 |
| AUC         | 0.819  |
As a next step, I will remove DistanceFromHome from the model features because its p-value indicates it is not statistically significant, and it was not deemed an important variable.
glm_mod_2 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_2, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.664231177 Left
## 2 Stayed 0.009255964 Stayed
## 3 Stayed 0.585345845 Left
## 4 Left 0.385019667 Stayed
## 5 Stayed 0.622988520 Left
## 6 Stayed 0.496550765 Stayed
## 7 Stayed 0.352931631 Stayed
## 8 Stayed 0.253555889 Stayed
## 9 Stayed 0.967401478 Left
## 10 Stayed 0.166238289 Stayed
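The score column shown here comes from score_model, a helper defined earlier in the notebook. A minimal base-R stand-in (hypothetical; the real helper may differ) would behave like this:

```r
# Hypothetical minimal stand-in for the notebook's score_model helper:
# label a case "Left" when its predicted probability exceeds the threshold.
score_model_sketch <- function(df, threshold) {
  df$score <- ifelse(df$probs > threshold, "Left", "Stayed")
  df
}

# Toy probabilities mirroring the first four rows printed above
toy <- data.frame(probs = c(0.664, 0.009, 0.585, 0.385))
score_model_sketch(toy, 0.5)$score
# "Left" "Stayed" "Left" "Stayed"
```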
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 825 52
## Left 284 161
##
## Accuracy : 0.7458
## 95% CI : (0.7215, 0.7691)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3471
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7559
## Specificity : 0.7439
## Pos Pred Value : 0.3618
## Neg Pred Value : 0.9407
## Prevalence : 0.1611
## Detection Rate : 0.1218
## Detection Prevalence : 0.3366
## Balanced Accuracy : 0.7499
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.819
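The AUC that perf_met reports via pROC can also be obtained from ranks alone; a self-contained base-R sketch of the rank-sum (Mann-Whitney) formulation:

```r
# Hedged sketch: AUC equals the probability that a randomly chosen "Left"
# case receives a higher predicted probability than a randomly chosen
# "Stayed" case; it can be computed from the rank sum without pROC.
auc_rank <- function(probs, labels, positive = "Left") {
  pos <- probs[labels == positive]
  neg <- probs[labels != positive]
  r <- rank(c(pos, neg))                      # midranks handle ties
  (sum(r[seq_along(pos)]) - length(pos) * (length(pos) + 1) / 2) /
    (length(pos) * length(neg))
}

auc_rank(c(0.9, 0.8, 0.4, 0.3, 0.2),
         c("Left", "Stayed", "Left", "Stayed", "Stayed"))
# 0.8333...: the Left cases outrank the Stayed cases in 5 of 6 pairs
```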
feature_imp(glm_mod_2)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 < 2.2e-16 ***
## Department 2 15.516 3082 1083.88 2.891e-12 ***
## Education 4 4.788 3078 1079.09 0.0025290 **
## EducationField 5 2.931 3073 1076.16 0.0741386 .
## Gender 1 0.800 3072 1075.36 0.0978385 .
## JobLevel 4 2.994 3068 1072.36 0.0363614 *
## JobRole 8 6.940 3060 1065.42 0.0025068 **
## MaritalStatus 2 29.506 3058 1035.92 < 2.2e-16 ***
## MonthlyIncome 1 1.168 3057 1034.75 0.0454666 *
## NumCompaniesWorked 1 23.410 3056 1011.34 < 2.2e-16 ***
## PercentSalaryHike 1 1.008 3055 1010.33 0.0631103 .
## StockOptionLevel 3 1.360 3052 1008.97 0.1986681
## TotalWorkingYears 1 15.653 3051 993.32 2.449e-13 ***
## TrainingTimesLastYear 1 6.100 3050 987.22 4.864e-06 ***
## YearsAtCompany 1 0.346 3049 986.87 0.2764910
## YearsSinceLastPromotion 1 8.859 3048 978.01 3.626e-08 ***
## YearsWithCurrManager 1 24.716 3047 953.30 < 2.2e-16 ***
## EnvironmentSatisfaction 4 27.028 3043 926.27 < 2.2e-16 ***
## JobSatisfaction 4 21.249 3039 905.02 5.892e-15 ***
## WorkLifeBalance 4 11.984 3035 893.04 2.633e-08 ***
## JobInvolvement 3 6.162 3032 886.88 0.0001002 ***
## PerformanceRating 1 0.389 3031 886.49 0.2482443
## AvgHrs 1 52.626 3030 833.86 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 2 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7458 |
| Sensitivity | 0.7559 |
| Specificity | 0.7439 |
| AUC         | 0.819  |
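As a sanity check, the headline metrics above can be recomputed by hand from the confusion matrix printed by perf_met:

```r
# Recompute Model 2's headline metrics directly from its confusion
# matrix ("Left" is the positive class, as in the caret summary).
tp <- 161; fn <- 52    # actual Left: caught vs missed
tn <- 825; fp <- 284   # actual Stayed: correct vs false alarms

accuracy    <- (tp + tn) / (tp + tn + fp + fn)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
pos_pred    <- tp / (tp + fp)

round(c(accuracy, sensitivity, specificity, pos_pred), 4)
# 0.7458 0.7559 0.7439 0.3618 -- matching the perf_met output above
```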
glm_mod_3 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_3, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.671471866 Left
## 2 Stayed 0.009654317 Stayed
## 3 Stayed 0.595584262 Left
## 4 Left 0.359600849 Stayed
## 5 Stayed 0.625388900 Left
## 6 Stayed 0.491898556 Stayed
## 7 Stayed 0.347320359 Stayed
## 8 Stayed 0.261389962 Stayed
## 9 Stayed 0.967605237 Left
## 10 Stayed 0.164727824 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 829 50
## Left 280 163
##
## Accuracy : 0.7504
## 95% CI : (0.7261, 0.7735)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.357
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7653
## Specificity : 0.7475
## Pos Pred Value : 0.3679
## Neg Pred Value : 0.9431
## Prevalence : 0.1611
## Detection Rate : 0.1233
## Detection Prevalence : 0.3351
## Balanced Accuracy : 0.7564
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.819
feature_imp(glm_mod_3)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 < 2.2e-16 ***
## Department 2 15.516 3082 1083.88 2.966e-12 ***
## Education 4 4.788 3078 1079.09 0.002547 **
## EducationField 5 2.931 3073 1076.16 0.074410 .
## Gender 1 0.800 3072 1075.36 0.098001 .
## JobLevel 4 2.994 3068 1072.36 0.036512 *
## JobRole 8 6.940 3060 1065.42 0.002529 **
## MaritalStatus 2 29.506 3058 1035.92 < 2.2e-16 ***
## MonthlyIncome 1 1.168 3057 1034.75 0.045571 *
## NumCompaniesWorked 1 23.410 3056 1011.34 < 2.2e-16 ***
## PercentSalaryHike 1 1.008 3055 1010.33 0.063238 .
## TotalWorkingYears 1 15.263 3054 995.07 4.958e-13 ***
## TrainingTimesLastYear 1 6.337 3053 988.73 3.216e-06 ***
## YearsAtCompany 1 0.341 3052 988.39 0.280200
## YearsSinceLastPromotion 1 8.799 3051 979.59 4.092e-08 ***
## YearsWithCurrManager 1 24.446 3050 955.15 < 2.2e-16 ***
## EnvironmentSatisfaction 4 27.763 3046 927.38 < 2.2e-16 ***
## JobSatisfaction 4 21.309 3042 906.07 5.515e-15 ***
## WorkLifeBalance 4 12.015 3038 894.06 2.551e-08 ***
## JobInvolvement 3 6.515 3035 887.54 5.671e-05 ***
## PerformanceRating 1 0.407 3034 887.14 0.238261
## AvgHrs 1 52.917 3033 834.22 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 3 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7504 |
| Sensitivity | 0.7653 |
| Specificity | 0.7475 |
| AUC         | 0.819  |
* All metrics improved, so we will keep StockOptionLevel out of the model.
* I will continue this process feature by feature to arrive at the simplest model.
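The drop-one-feature-at-a-time loop can be automated. The sketch below is a hedged illustration on simulated data (not the notebook's actual selection code), using drop1() F-tests, which are the appropriate marginal tests for quasibinomial fits:

```r
# Hedged sketch of automating backward elimination: drop the least
# significant term while any term's drop1 F-test exceeds alpha.
set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$x1 - 0.8 * d$x2))

mod <- glm(y ~ x1 + x2 + noise, data = d, family = quasibinomial)

prune_once <- function(mod, alpha = 0.05) {
  tests <- drop1(mod, test = "F")         # marginal F-test per term
  pvals <- tests[["Pr(>F)"]][-1]          # first row is <none>
  labs  <- rownames(tests)[-1]
  worst <- which.max(pvals)
  if (pvals[worst] < alpha) return(mod)   # everything significant: stop
  update(mod, as.formula(paste(". ~ . -", labs[worst])))
}

attr(terms(prune_once(mod)), "term.labels")  # x1 and x2 are retained
```

Note that drop1 tests each term given all the others, whereas the feature_imp / anova tables in this notebook add terms sequentially, so their p-values depend on the order of the formula.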
glm_mod_4 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_4, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.67704493 Left
## 2 Stayed 0.01264084 Stayed
## 3 Stayed 0.66417158 Left
## 4 Left 0.41947427 Stayed
## 5 Stayed 0.60319744 Left
## 6 Stayed 0.48401040 Stayed
## 7 Stayed 0.23249550 Stayed
## 8 Stayed 0.25094262 Stayed
## 9 Stayed 0.96591209 Left
## 10 Stayed 0.12816160 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 813 52
## Left 296 161
##
## Accuracy : 0.7368
## 95% CI : (0.7121, 0.7603)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3343
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7559
## Specificity : 0.7331
## Pos Pred Value : 0.3523
## Neg Pred Value : 0.9399
## Prevalence : 0.1611
## Detection Rate : 0.1218
## Detection Prevalence : 0.3457
## Balanced Accuracy : 0.7445
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.82
feature_imp(glm_mod_4)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 7.865e-16 ***
## Department 2 15.516 3082 1083.88 1.788e-10 ***
## Education 4 4.788 3078 1079.09 0.0077808 **
## EducationField 5 2.931 3073 1076.16 0.1316630
## Gender 1 0.800 3072 1075.36 0.1281293
## JobLevel 4 2.994 3068 1072.36 0.0701219 .
## JobRole 8 6.940 3060 1065.42 0.0100451 *
## MaritalStatus 2 29.506 3058 1035.92 < 2.2e-16 ***
## MonthlyIncome 1 1.168 3057 1034.75 0.0659872 .
## NumCompaniesWorked 1 23.410 3056 1011.34 < 2.2e-16 ***
## PercentSalaryHike 1 1.008 3055 1010.33 0.0876195 .
## TotalWorkingYears 1 15.263 3054 995.07 3.030e-11 ***
## TrainingTimesLastYear 1 6.337 3053 988.73 1.853e-05 ***
## YearsSinceLastPromotion 1 8.107 3052 980.62 1.279e-06 ***
## YearsWithCurrManager 1 22.190 3051 958.43 1.125e-15 ***
## EnvironmentSatisfaction 4 28.146 3047 930.29 < 2.2e-16 ***
## JobSatisfaction 4 19.603 3043 910.69 1.420e-11 ***
## WorkLifeBalance 4 11.647 3039 899.04 8.600e-07 ***
## JobInvolvement 3 6.900 3036 892.14 0.0001728 ***
## PerformanceRating 1 0.440 3035 891.70 0.2590968
## AvgHrs 1 51.502 3034 840.20 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 4 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7368 |
| Sensitivity | 0.7559 |
| Specificity | 0.7331 |
| AUC         | 0.82   |
* All metrics deteriorated, so we will add YearsAtCompany back and remove a different feature instead.
glm_mod_5 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_5, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.673259822 Left
## 2 Stayed 0.009846031 Stayed
## 3 Stayed 0.600008249 Left
## 4 Left 0.361518746 Stayed
## 5 Stayed 0.626724031 Left
## 6 Stayed 0.488329322 Stayed
## 7 Stayed 0.346870581 Stayed
## 8 Stayed 0.262715362 Stayed
## 9 Stayed 0.967970060 Left
## 10 Stayed 0.165483647 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 829 49
## Left 280 164
##
## Accuracy : 0.7511
## 95% CI : (0.7269, 0.7742)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3598
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7700
## Specificity : 0.7475
## Pos Pred Value : 0.3694
## Neg Pred Value : 0.9442
## Prevalence : 0.1611
## Detection Rate : 0.1241
## Detection Prevalence : 0.3359
## Balanced Accuracy : 0.7587
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.819
feature_imp(glm_mod_5)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 < 2.2e-16 ***
## Department 2 15.516 3082 1083.88 2.949e-12 ***
## Education 4 4.788 3078 1079.09 0.002543 **
## EducationField 5 2.931 3073 1076.16 0.074348 .
## Gender 1 0.800 3072 1075.36 0.097964 .
## JobLevel 4 2.994 3068 1072.36 0.036478 *
## JobRole 8 6.940 3060 1065.42 0.002524 **
## MaritalStatus 2 29.506 3058 1035.92 < 2.2e-16 ***
## MonthlyIncome 1 1.168 3057 1034.75 0.045547 *
## NumCompaniesWorked 1 23.410 3056 1011.34 < 2.2e-16 ***
## PercentSalaryHike 1 1.008 3055 1010.33 0.063209 .
## TotalWorkingYears 1 15.263 3054 995.07 4.929e-13 ***
## TrainingTimesLastYear 1 6.337 3053 988.73 3.208e-06 ***
## YearsAtCompany 1 0.341 3052 988.39 0.280147
## YearsSinceLastPromotion 1 8.799 3051 979.59 4.078e-08 ***
## YearsWithCurrManager 1 24.446 3050 955.15 < 2.2e-16 ***
## EnvironmentSatisfaction 4 27.763 3046 927.38 < 2.2e-16 ***
## JobSatisfaction 4 21.309 3042 906.07 5.472e-15 ***
## WorkLifeBalance 4 12.015 3038 894.06 2.540e-08 ***
## JobInvolvement 3 6.515 3035 887.54 5.657e-05 ***
## AvgHrs 1 53.308 3034 834.23 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 5 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7511 |
| Sensitivity | 0.7700 |
| Specificity | 0.7475 |
| AUC         | 0.819  |
glm_mod_6 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
# EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_6, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.6854979 Left
## 2 Stayed 0.0121785 Stayed
## 3 Stayed 0.6581509 Left
## 4 Left 0.3946230 Stayed
## 5 Stayed 0.6158727 Left
## 6 Stayed 0.5910461 Left
## 7 Stayed 0.2137386 Stayed
## 8 Stayed 0.2353871 Stayed
## 9 Stayed 0.9664472 Left
## 10 Stayed 0.1458530 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 804 50
## Left 305 163
##
## Accuracy : 0.7315
## 95% CI : (0.7067, 0.7552)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3304
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7653
## Specificity : 0.7250
## Pos Pred Value : 0.3483
## Neg Pred Value : 0.9415
## Prevalence : 0.1611
## Detection Rate : 0.1233
## Detection Prevalence : 0.3540
## Balanced Accuracy : 0.7451
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.82
feature_imp(glm_mod_6)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 5.549e-16 ***
## Department 2 15.516 3082 1083.88 1.427e-10 ***
## Education 4 4.788 3078 1079.09 0.0073223 **
## Gender 1 0.737 3077 1078.35 0.1422847
## JobLevel 4 3.442 3073 1074.91 0.0394413 *
## JobRole 8 6.802 3065 1068.11 0.0108138 *
## MaritalStatus 2 29.317 3063 1038.79 < 2.2e-16 ***
## MonthlyIncome 1 1.172 3062 1037.62 0.0642607 .
## NumCompaniesWorked 1 22.497 3061 1015.12 5.143e-16 ***
## PercentSalaryHike 1 1.030 3060 1014.09 0.0827544 .
## TotalWorkingYears 1 15.169 3059 998.92 2.779e-11 ***
## TrainingTimesLastYear 1 6.145 3058 992.78 2.261e-05 ***
## YearsSinceLastPromotion 1 7.323 3057 985.46 3.729e-06 ***
## YearsWithCurrManager 1 22.400 3056 963.06 5.941e-16 ***
## EnvironmentSatisfaction 4 27.719 3052 935.34 < 2.2e-16 ***
## JobSatisfaction 4 20.177 3048 915.16 4.797e-12 ***
## WorkLifeBalance 4 12.165 3044 903.00 3.585e-07 ***
## JobInvolvement 3 6.712 3041 896.28 0.0002041 ***
## AvgHrs 1 52.532 3040 843.75 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 6 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7315 |
| Sensitivity | 0.7653 |
| Specificity | 0.7250 |
| AUC         | 0.82   |
glm_mod_7 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
# EducationField +
# Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_7, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.69424825 Left
## 2 Stayed 0.01176851 Stayed
## 3 Stayed 0.65325062 Left
## 4 Left 0.39006665 Stayed
## 5 Stayed 0.60879219 Left
## 6 Stayed 0.58609442 Left
## 7 Stayed 0.20657558 Stayed
## 8 Stayed 0.23000728 Stayed
## 9 Stayed 0.96820633 Left
## 10 Stayed 0.14285659 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 806 49
## Left 303 164
##
## Accuracy : 0.7337
## 95% CI : (0.709, 0.7574)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3352
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7700
## Specificity : 0.7268
## Pos Pred Value : 0.3512
## Neg Pred Value : 0.9427
## Prevalence : 0.1611
## Detection Rate : 0.1241
## Detection Prevalence : 0.3533
## Balanced Accuracy : 0.7484
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.82
feature_imp(glm_mod_7)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 7.129e-16 ***
## Department 2 15.516 3082 1083.88 1.678e-10 ***
## Education 4 4.788 3078 1079.09 0.0076488 **
## JobLevel 4 3.267 3074 1075.82 0.0502004 .
## JobRole 8 6.841 3066 1068.98 0.0109263 *
## MaritalStatus 2 29.344 3064 1039.64 < 2.2e-16 ***
## MonthlyIncome 1 1.132 3063 1038.51 0.0699732 .
## NumCompaniesWorked 1 21.811 3062 1016.69 1.792e-15 ***
## PercentSalaryHike 1 1.034 3061 1015.66 0.0832511 .
## TotalWorkingYears 1 15.258 3060 1000.40 2.862e-11 ***
## TrainingTimesLastYear 1 6.287 3059 994.12 1.948e-05 ***
## YearsSinceLastPromotion 1 7.198 3058 986.92 4.878e-06 ***
## YearsWithCurrManager 1 22.859 3057 964.06 3.832e-16 ***
## EnvironmentSatisfaction 4 27.905 3053 936.15 < 2.2e-16 ***
## JobSatisfaction 4 20.346 3049 915.81 4.639e-12 ***
## WorkLifeBalance 4 12.290 3045 903.52 3.405e-07 ***
## JobInvolvement 3 6.648 3042 896.87 0.0002384 ***
## AvgHrs 1 52.974 3041 843.90 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 7 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7337 |
| Sensitivity | 0.7700 |
| Specificity | 0.7268 |
| AUC         | 0.82   |
glm_mod_8 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
# EducationField +
# Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
# PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_8, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.64931169 Left
## 2 Stayed 0.01295795 Stayed
## 3 Stayed 0.60477348 Left
## 4 Left 0.42582268 Stayed
## 5 Stayed 0.63026618 Left
## 6 Stayed 0.57631750 Left
## 7 Stayed 0.20683100 Stayed
## 8 Stayed 0.25633616 Stayed
## 9 Stayed 0.95979072 Left
## 10 Stayed 0.16044485 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 806 51
## Left 303 162
##
## Accuracy : 0.7322
## 95% CI : (0.7075, 0.7559)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3297
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7606
## Specificity : 0.7268
## Pos Pred Value : 0.3484
## Neg Pred Value : 0.9405
## Prevalence : 0.1611
## Detection Rate : 0.1225
## Detection Prevalence : 0.3517
## Balanced Accuracy : 0.7437
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.82
feature_imp(glm_mod_8)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 6.842e-15 ***
## Department 2 15.516 3082 1083.88 7.221e-10 ***
## Education 4 4.788 3078 1079.09 0.011320 *
## JobLevel 4 3.267 3074 1075.82 0.064604 .
## JobRole 8 6.841 3066 1068.98 0.017391 *
## MaritalStatus 2 29.344 3064 1039.64 < 2.2e-16 ***
## MonthlyIncome 1 1.132 3063 1038.51 0.079714 .
## NumCompaniesWorked 1 21.811 3062 1016.69 1.440e-14 ***
## TotalWorkingYears 1 15.497 3061 1001.20 8.908e-11 ***
## TrainingTimesLastYear 1 6.497 3060 994.70 2.687e-05 ***
## YearsSinceLastPromotion 1 7.185 3059 987.52 1.009e-05 ***
## YearsWithCurrManager 1 22.327 3058 965.19 7.073e-15 ***
## EnvironmentSatisfaction 4 27.119 3054 938.07 3.979e-15 ***
## JobSatisfaction 4 19.733 3050 918.34 6.569e-11 ***
## WorkLifeBalance 4 12.670 3046 905.67 6.241e-07 ***
## JobInvolvement 3 6.459 3043 899.21 0.000551 ***
## AvgHrs 1 53.772 3042 845.44 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 8 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7322 |
| Sensitivity | 0.7606 |
| Specificity | 0.7268 |
| AUC         | 0.82   |
glm_mod_9 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
# EducationField +
# Gender +
# JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
# PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_9, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.66321747 Left
## 2 Stayed 0.01608664 Stayed
## 3 Stayed 0.53482812 Left
## 4 Left 0.42939576 Stayed
## 5 Stayed 0.64524716 Left
## 6 Stayed 0.53042227 Left
## 7 Stayed 0.23004766 Stayed
## 8 Stayed 0.27717838 Stayed
## 9 Stayed 0.96519541 Left
## 10 Stayed 0.17150777 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 800 53
## Left 309 160
##
## Accuracy : 0.7262
## 95% CI : (0.7013, 0.7501)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3181
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7512
## Specificity : 0.7214
## Pos Pred Value : 0.3412
## Neg Pred Value : 0.9379
## Prevalence : 0.1611
## Detection Rate : 0.1210
## Detection Prevalence : 0.3548
## Balanced Accuracy : 0.7363
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.818
feature_imp(glm_mod_9)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 1.761e-14 ***
## Department 2 15.516 3082 1083.88 1.329e-09 ***
## Education 4 4.788 3078 1079.09 0.0133214 *
## JobRole 8 6.725 3070 1072.36 0.0234434 *
## MaritalStatus 2 28.611 3068 1043.75 < 2.2e-16 ***
## MonthlyIncome 1 0.806 3067 1042.95 0.1450764
## NumCompaniesWorked 1 21.909 3066 1021.04 3.019e-14 ***
## TotalWorkingYears 1 14.403 3065 1006.64 7.268e-10 ***
## TrainingTimesLastYear 1 6.347 3064 1000.29 4.326e-05 ***
## YearsSinceLastPromotion 1 7.319 3063 992.97 1.127e-05 ***
## YearsWithCurrManager 1 21.784 3062 971.18 3.570e-14 ***
## EnvironmentSatisfaction 4 26.360 3058 944.82 2.970e-14 ***
## JobSatisfaction 4 20.645 3054 924.18 4.357e-11 ***
## WorkLifeBalance 4 12.694 3050 911.49 9.693e-07 ***
## JobInvolvement 3 6.938 3047 904.55 0.0003851 ***
## AvgHrs 1 53.177 3046 851.37 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 9 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7262 |
| Sensitivity | 0.7512 |
| Specificity | 0.7214 |
| AUC         | 0.818  |
glm_mod_10 = glm(Attrition ~
Age +
BusinessTravel +
Department +
# DistanceFromHome +
Education +
# EducationField +
# Gender +
# JobLevel +
JobRole +
MaritalStatus +
# MonthlyIncome +
NumCompaniesWorked +
# PercentSalaryHike +
# StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
# YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data = dLM_train,
family = quasibinomial,
weights = weights)
dLM_test$probs= predict(glm_mod_10, newdata=dLM_test, type = 'response')
dLM_test = score_model(dLM_test, 0.5)
dLM_test[1:10, c('Attrition','probs','score')]
## Attrition probs score
## 1 Left 0.65830003 Left
## 2 Stayed 0.01630002 Stayed
## 3 Stayed 0.52236878 Left
## 4 Left 0.42621947 Stayed
## 5 Stayed 0.63128309 Left
## 6 Stayed 0.55196791 Left
## 7 Stayed 0.22995908 Stayed
## 8 Stayed 0.28095043 Stayed
## 9 Stayed 0.97006920 Left
## 10 Stayed 0.16200878 Stayed
perf_met(dLM_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 801 54
## Left 308 159
##
## Accuracy : 0.7262
## 95% CI : (0.7013, 0.7501)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3164
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7465
## Specificity : 0.7223
## Pos Pred Value : 0.3405
## Neg Pred Value : 0.9368
## Prevalence : 0.1611
## Detection Rate : 0.1203
## Detection Prevalence : 0.3533
## Balanced Accuracy : 0.7344
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
feature_imp(glm_mod_10)
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: Attrition
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 3087 1154.38
## Age 1 30.941 3086 1123.44 < 2.2e-16 ***
## BusinessTravel 2 24.042 3084 1099.39 3.320e-14 ***
## Department 2 15.516 3082 1083.88 2.001e-09 ***
## Education 4 4.788 3078 1079.09 0.0148543 *
## JobRole 8 6.725 3070 1072.36 0.0265436 *
## MaritalStatus 2 28.611 3068 1043.75 < 2.2e-16 ***
## NumCompaniesWorked 1 22.088 3067 1021.67 4.300e-14 ***
## TotalWorkingYears 1 14.476 3066 1007.19 9.747e-10 ***
## TrainingTimesLastYear 1 6.496 3065 1000.69 4.217e-05 ***
## YearsSinceLastPromotion 1 7.115 3064 993.58 1.820e-05 ***
## YearsWithCurrManager 1 21.761 3063 971.82 6.603e-14 ***
## EnvironmentSatisfaction 4 26.480 3059 945.34 5.021e-14 ***
## JobSatisfaction 4 20.427 3055 924.91 9.670e-11 ***
## WorkLifeBalance 4 12.780 3051 912.13 1.197e-06 ***
## JobInvolvement 3 6.967 3048 905.17 0.0004427 ***
## AvgHrs 1 52.884 3047 852.28 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 10 Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.7262 |
| Sensitivity | 0.7465 |
| Specificity | 0.7223 |
| AUC         | 0.817  |
Model performance, as measured by sensitivity, deteriorates with further feature removal after Model 7. In addition, every variable remaining in Model 7 was at least marginally significant (p < 0.1). I have therefore selected Model 7 as my most predictive model.
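For reference, the test-set metrics reported above can be collected into a single frame (values transcribed from the perf_met summaries for Models 1 through 10):

```r
# Side-by-side view of the reported test-set metrics per model.
perf <- data.frame(
  model       = 1:10,
  accuracy    = c(0.7436, 0.7458, 0.7504, 0.7368, 0.7511,
                  0.7315, 0.7337, 0.7322, 0.7262, 0.7262),
  sensitivity = c(0.7559, 0.7559, 0.7653, 0.7559, 0.7700,
                  0.7653, 0.7700, 0.7606, 0.7512, 0.7465),
  specificity = c(0.7412, 0.7439, 0.7475, 0.7331, 0.7475,
                  0.7250, 0.7268, 0.7268, 0.7214, 0.7223)
)
subset(perf, sensitivity == max(sensitivity))$model
# 5 7 -- Models 5 and 7 tie on sensitivity; Model 7 uses fewer features
```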
Next, we will examine whether the classification threshold of 0.5 is appropriate. If the correction for class imbalance in the labels was successful, this value should not need to change much.
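One way to formalize this threshold check is to score a grid of cutoffs by balanced accuracy; a hedged illustration on toy data (not the notebook's test set):

```r
# Score each candidate cutoff by balanced accuracy on toy data.
balanced_acc <- function(probs, labels, t, positive = "Left") {
  sens <- mean(probs[labels == positive] > t)   # positives above cutoff
  spec <- mean(probs[labels != positive] <= t)  # negatives at or below it
  (sens + spec) / 2
}

set.seed(42)
p <- c(runif(40, 0.3, 1.0), runif(160, 0.0, 0.7))  # imbalanced toy scores
y <- rep(c("Left", "Stayed"), c(40, 160))
grid <- seq(0.1, 0.9, by = 0.1)
ba <- sapply(grid, function(t) balanced_acc(p, y, t))
grid[which.max(ba)]   # cutoff with the best sensitivity/specificity trade-off
```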
test_threshold = function(test, threshold){
# Predict with the chosen model, then classify at the supplied cutoff
test$probs = predict(glm_mod_7, newdata = test, type = 'response')
test = score_model(test, threshold)
cat('\n')
cat(paste('For threshold = ', as.character(threshold), '\n'))
print(perf_met(test))
}
thresholds = seq(0.1, 0.9, by = 0.1)
for(t in thresholds) test_threshold(dLM_test, t) # Iterate over the thresholds
##
## For threshold = 0.1
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 209 4
## Left 900 209
##
## Accuracy : 0.3162
## 95% CI : (0.2912, 0.342)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.0629
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.9812
## Specificity : 0.1885
## Pos Pred Value : 0.1885
## Neg Pred Value : 0.9812
## Prevalence : 0.1611
## Detection Rate : 0.1581
## Detection Prevalence : 0.8389
## Balanced Accuracy : 0.5848
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.2
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 404 10
## Left 705 203
##
## Accuracy : 0.4592
## 95% CI : (0.432, 0.4865)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.1369
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.9531
## Specificity : 0.3643
## Pos Pred Value : 0.2236
## Neg Pred Value : 0.9758
## Prevalence : 0.1611
## Detection Rate : 0.1536
## Detection Prevalence : 0.6868
## Balanced Accuracy : 0.6587
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.3
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 553 26
## Left 556 187
##
## Accuracy : 0.5598
## 95% CI : (0.5325, 0.5867)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.1878
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.8779
## Specificity : 0.4986
## Pos Pred Value : 0.2517
## Neg Pred Value : 0.9551
## Prevalence : 0.1611
## Detection Rate : 0.1415
## Detection Prevalence : 0.5620
## Balanced Accuracy : 0.6883
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.4
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 684 35
## Left 425 178
##
## Accuracy : 0.652
## 95% CI : (0.6257, 0.6777)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.2601
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.8357
## Specificity : 0.6168
## Pos Pred Value : 0.2952
## Neg Pred Value : 0.9513
## Prevalence : 0.1611
## Detection Rate : 0.1346
## Detection Prevalence : 0.4561
## Balanced Accuracy : 0.7262
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.5
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 801 54
## Left 308 159
##
## Accuracy : 0.7262
## 95% CI : (0.7013, 0.7501)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3164
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.7465
## Specificity : 0.7223
## Pos Pred Value : 0.3405
## Neg Pred Value : 0.9368
## Prevalence : 0.1611
## Detection Rate : 0.1203
## Detection Prevalence : 0.3533
## Balanced Accuracy : 0.7344
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.6
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 913 70
## Left 196 143
##
## Accuracy : 0.7988
## 95% CI : (0.7761, 0.8201)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.9999
##
## Kappa : 0.3992
##
## Mcnemar's Test P-Value : 1.799e-14
##
## Sensitivity : 0.6714
## Specificity : 0.8233
## Pos Pred Value : 0.4218
## Neg Pred Value : 0.9288
## Prevalence : 0.1611
## Detection Rate : 0.1082
## Detection Prevalence : 0.2564
## Balanced Accuracy : 0.7473
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.7
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 992 96
## Left 117 117
##
## Accuracy : 0.8389
## 95% CI : (0.8179, 0.8583)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.5183
##
## Kappa : 0.4268
##
## Mcnemar's Test P-Value : 0.1706
##
## Sensitivity : 0.5493
## Specificity : 0.8945
## Pos Pred Value : 0.5000
## Neg Pred Value : 0.9118
## Prevalence : 0.1611
## Detection Rate : 0.0885
## Detection Prevalence : 0.1770
## Balanced Accuracy : 0.7219
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.8
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1055 143
## Left 54 70
##
## Accuracy : 0.851
## 95% CI : (0.8306, 0.8698)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.1225
##
## Kappa : 0.3368
##
## Mcnemar's Test P-Value : 3.617e-10
##
## Sensitivity : 0.32864
## Specificity : 0.95131
## Pos Pred Value : 0.56452
## Neg Pred Value : 0.88063
## Prevalence : 0.16112
## Detection Rate : 0.05295
## Detection Prevalence : 0.09380
## Balanced Accuracy : 0.63997
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
##
## For threshold = 0.9
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1100 192
## Left 9 21
##
## Accuracy : 0.848
## 95% CI : (0.8275, 0.8669)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.1954
##
## Kappa : 0.1386
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.09859
## Specificity : 0.99188
## Pos Pred Value : 0.70000
## Neg Pred Value : 0.85139
## Prevalence : 0.16112
## Detection Rate : 0.01589
## Detection Prevalence : 0.02269
## Balanced Accuracy : 0.54524
##
## 'Positive' Class : Left
##
## Setting levels: control = Stayed, case = Left
## Setting direction: controls < cases
## AUC = 0.817
This threshold sweep indicates that a threshold of 0.5 is adequate for our model, giving a reasonable balance between sensitivity (0.747) and specificity (0.722).
Using our final model glm_mod_7, I will perform repeated cross-validation on the entire dataset to examine whether the model generalizes well.
weights = ifelse(dt$Attrition == 'Left', 0.84, 0.16)
control <- trainControl(method = "repeatedcv",
number = 5,
repeats = 3,
returnResamp ="all",
savePredictions = TRUE,
classProbs = TRUE,
summaryFunction = twoClassSummary)
set.seed(1955)
glm_final <- train(Attrition ~
Age +
BusinessTravel +
Department +
Education +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
TotalWorkingYears +
TrainingTimesLastYear +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
AvgHrs,
data=dt,
method="glm",
metric= "Sens",  # twoClassSummary reports ROC/Sens/Spec; "Recall" is not among them
weights = weights,
trControl=control)
glm_final
## Generalized Linear Model
##
## 4410 samples
## 19 predictor
## 2 classes: 'Stayed', 'Left'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 3 times)
## Summary of sample sizes: 3527, 3528, 3529, 3528, 3528, 3528, ...
## Resampling results:
##
## ROC Sens Spec
## 0.8195934 0.7368653 0.7627663
While the average sensitivity appears to have degraded slightly, this is expected given the repeated resampling of cross-validation. I conclude that our final Logistic Regression model generalizes reasonably well.
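As a quick check on this conclusion, we can look at the spread of the fold-level metrics rather than only their averages. This is a minimal sketch assuming the glm_final object fitted above; resample is a standard field on a caret train object when returnResamp = "all" is set.

```r
library(dplyr)

# Mean and standard deviation of the per-fold metrics across the
# 15 resamples (5 folds x 3 repeats). A small sd relative to the mean
# supports the claim that performance is stable, not just good on average.
glm_final$resample %>%
  summarise(across(c(ROC, Sens, Spec),
                   list(mean = mean, sd = sd)))
```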
Next, let’s take a look at an ensemble classification model, Random Forest.
Create a copy of the partitioned, scaled datasets as before.
dRF_train <- dt_train
dRF_test <- dt_test
head(dRF_train)
## Age Attrition BusinessTravel Department
## 1 1.5351693 Stayed Travel-Rarely Sales
## 3 -0.5410709 Stayed Travel-Frequently Research & Development
## 5 -0.5410709 Stayed Travel-Rarely Research & Development
## 6 0.9887903 Stayed Travel-Rarely Research & Development
## 7 -0.9781741 Left Travel-Rarely Research & Development
## 9 -0.6503467 Stayed Travel-Rarely Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 1 -0.39949765 College Life Sciences Female 1
## 3 0.94741412 Master Other Male 4
## 5 0.09028845 Below College Medical Male 1
## 6 -0.15460460 Bachelor Life Sciences Female 4
## 7 0.21273497 College Medical Male 2
## 9 -1.01173027 Bachelor Life Sciences Male 3
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 1 Healthcare Representative Married 1.4088205 -0.6721110
## 3 Sales Executive Married 2.7295641 -0.6721110
## 5 Sales Executive Single -0.8818574 0.5352647
## 6 Research Director Married -0.5142519 0.1328061
## 7 Sales Executive Single -0.1438824 -0.2696525
## 9 Laboratory Technician Married -0.9452157 -1.0745696
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 1 -1.15935900 0 -1.3227615 2.5011973
## 3 -0.06678233 3 -0.8096420 -0.6088076
## 5 -0.88621483 2 -0.2965226 -0.6088076
## 6 -0.61307067 0 2.1407948 1.7236961
## 7 1.29893850 1 -0.8096420 -0.6088076
## 9 1.57208267 0 -0.1682427 -0.6088076
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 1 -0.97857482 -0.6824716 -1.16099726
## 3 -0.33730174 -0.6824716 -0.32872313
## 5 -0.17698348 -0.6824716 -0.05129842
## 6 -0.01666521 1.4623824 0.78097571
## 7 -1.13889309 -0.6824716 -1.16099726
## 9 0.30397133 1.4623824 1.05840042
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 1 High Very High Good High
## 3 Medium Medium Bad High
## 5 Very High Low Better High
## 6 High Medium Good High
## 7 Low High Bad High
## 9 Medium Very High Better High
## PerformanceRating AvgHrs
## 1 Excellent -0.2538199
## 3 Excellent -0.5216240
## 5 Excellent 0.2222764
## 6 Excellent 2.2977583
## 7 Outstanding -0.5885750
## 9 Outstanding -0.3505269
head(dRF_test)
## Age Attrition BusinessTravel Department
## 2 -0.6503467 Left Travel-Frequently Research & Development
## 4 0.1145839 Stayed Non-Travel Research & Development
## 8 -0.8688983 Stayed Travel-Rarely Research & Development
## 14 1.0980661 Left Non-Travel Research & Development
## 17 -1.7431047 Stayed Travel-Rarely Research & Development
## 21 -1.1967257 Stayed Travel-Frequently Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 2 0.09028845 Below College Life Sciences Female 1
## 4 -0.88928374 Doctor Life Sciences Male 3
## 8 1.06986064 Bachelor Life Sciences Male 2
## 14 -1.01173027 Below College Medical Male 1
## 17 -0.76683722 College Life Sciences Male 1
## 21 -1.01173027 Master Other Male 2
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 2 Research Scientist Single -0.4891637 -1.0745696
## 4 Human Resources Married 0.3893476 0.1328061
## 8 Sales Executive Married -0.7115555 -0.2696525
## 14 Research Scientist Married -0.1547256 -0.6721110
## 17 Laboratory Technician Single -0.4840610 -0.6721110
## 21 Laboratory Technician Divorced 0.8413600 -0.6721110
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 2 2.1183710 1 -0.6813621 0.1686936
## 4 -1.1593590 3 0.2165969 1.7236961
## 8 1.8452268 3 -0.1682427 -0.6088076
## 14 -1.1593590 2 -0.1682427 0.9461948
## 17 -0.8862148 3 -1.0662017 0.1686936
## 21 0.7526502 0 -0.6813621 0.1686936
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 2 -0.3373017 -0.3760639 -0.05129842
## 4 0.1436531 1.4623824 0.22612629
## 8 -1.1388931 -0.6824716 -1.16099726
## 14 0.4642896 2.0751978 1.33582514
## 17 -0.6579383 -0.3760639 -1.16099726
## 21 -0.1769835 -0.3760639 -0.05129842
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 2 High Medium Best Medium
## 4 Very High Very High Better Medium
## 8 Low Medium Better High
## 14 Low Medium Good Medium
## 17 Very High High Best Medium
## 21 High Medium Bad High
## PerformanceRating AvgHrs
## 2 Outstanding 0.006545263
## 4 Excellent -0.387721916
## 8 Outstanding -0.729916072
## 14 Excellent 1.256297831
## 17 Excellent -0.811745109
## 21 Excellent -0.090161781
Modify the performance metric function to work with hard class labels, since predict() on a randomForest model returns predicted classes by default rather than probabilities.
perf_met <- function(df) {
#Confusion Matrix Summary
cm <- suppressWarnings(confusionMatrix(data = as.factor(df$score),
reference = as.factor(df$Attrition),
positive = "Left"))
print(cm)
table <- data.frame(cm$table)
plotTable <- table %>%
mutate(Correctness = ifelse(table$Prediction == table$Reference, "Correct", "Incorrect")) %>%
group_by(Reference) %>%
mutate(Proportion = Freq/sum(Freq))
# Fill alpha relative to sensitivity/specificity by proportional outcomes within reference groups
ggplot(data = plotTable,
mapping = aes(x=Reference, y=Prediction, fill=Correctness, alpha=Proportion)) +
geom_tile() +
geom_text(aes(label=Freq), vjust=.5, fontface="bold", alpha=1) +
scale_fill_manual(values = c(Correct="#264d73", Incorrect="#b30000")) +
xlim(rev(levels(table$Reference))) +
ylim(levels(table$Prediction)) +
theme_light()
}
## Function to show which features are important.
feature_imp <- function(mod) {
imp = varImp(mod)
plot <- ggplot(imp, aes(x=reorder(rownames(imp),Overall), y=Overall)) +
geom_point(color="skyblue", size=2, alpha=0.8) +
geom_segment(aes(x=rownames(imp), xend=rownames(imp), y=0, yend=Overall), color='skyblue') +
xlab('Variable') +
ylab('Overall Importance') +
theme_light() +
coord_flip()
print(plot)
}
rf_mod <- randomForest(Attrition ~ .,
data=dRF_train
)
print(rf_mod)
##
## Call:
## randomForest(formula = Attrition ~ ., data = dRF_train)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 5
##
## OOB estimate of error rate: 1.55%
## Confusion matrix:
## Stayed Left class.error
## Stayed 2585 5 0.001930502
## Left 43 455 0.086345382
dRF_test$score = predict(rf_mod, newdata = dRF_test)
dRF_test[1:10, c('Attrition','score')]
## Attrition score
## 2 Left Left
## 4 Stayed Stayed
## 8 Stayed Stayed
## 14 Left Left
## 17 Stayed Stayed
## 21 Stayed Stayed
## 24 Stayed Stayed
## 25 Stayed Stayed
## 27 Stayed Stayed
## 28 Stayed Stayed
perf_met(dRF_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1109 12
## Left 0 201
##
## Accuracy : 0.9909
## 95% CI : (0.9842, 0.9953)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9656
##
## Mcnemar's Test P-Value : 0.001496
##
## Sensitivity : 0.9437
## Specificity : 1.0000
## Pos Pred Value : 1.0000
## Neg Pred Value : 0.9893
## Prevalence : 0.1611
## Detection Rate : 0.1520
## Detection Prevalence : 0.1520
## Balanced Accuracy : 0.9718
##
## 'Positive' Class : Left
##
feature_imp(rf_mod)
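Note that class probabilities are in fact available from randomForest via type = "prob", so the threshold analysis we ran for the logistic model could also be applied here. A minimal sketch, assuming the rf_mod and dRF_test objects from above:

```r
# Class probabilities for the test set; column "Left" holds P(attrition).
rf_probs <- predict(rf_mod, newdata = dRF_test, type = "prob")

# Thresholding the probability recovers hard predictions (0.5 shown here),
# and other thresholds could be swept exactly as before.
rf_score <- ifelse(rf_probs[, "Left"] > 0.5, "Left", "Stayed")
table(Predicted = rf_score, Reference = dRF_test$Attrition)
```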
Insights
To optimize the mtry and ntree hyperparameters jointly, I created a custom Random Forest method for use with the caret package’s train() function.
## Create custom training algorithm that tunes both mtry and ntree together.
customRF <- list(type = "Classification", library = "randomForest", loop = NULL)
customRF$parameters <- data.frame(parameter = c("mtry", "ntree"),
class = rep("numeric", 2),
label = c("mtry", "ntree"))
customRF$grid <- function(x, y, len = NULL, search = "grid") {}
customRF$fit <- function(x, y, wts, param, lev, last, weights, classProbs, ...){
randomForest(x, y, mtry = param$mtry, ntree=param$ntree, ...)}
customRF$predict <- function(modelFit, newdata, preProc = NULL, submodels = NULL){
predict(modelFit, newdata)}
customRF$prob <- function(modelFit, newdata, preProc = NULL, submodels = NULL){
predict(modelFit, newdata, type = "prob")}
customRF$sort <- function(x) {x[order(x[,1]),]}
customRF$levels <- function(x) {x$classes}
# Train model with customRF training algorithm.
weights = ifelse(dRF_train$Attrition == 'Left', 0.84, 0.16)
control <- trainControl(method = "repeatedcv",
number = 5,
repeats = 3,
search='grid',
returnResamp ="all",
savePredictions = TRUE,
classProbs = TRUE,
summaryFunction = twoClassSummary)
# Hyperparameter grid
tunegrid <- expand.grid(.mtry=c(5:12), .ntree=c(101,501,1001,2001))
set.seed(1955)
myRF_mod <- train(Attrition ~
Age +
BusinessTravel +
Department +
DistanceFromHome +
Education +
EducationField +
# Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
# PerformanceRating +
AvgHrs,
data=dRF_train,
method=customRF,
metric= "Sens",
tuneGrid=tunegrid,
trControl=control)
plot(myRF_mod)
dRF_test$score <- predict(myRF_mod, newdata = dRF_test)  # column must be named 'score' for perf_met()
perf_met(dRF_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1109 12
## Left 0 201
##
## Accuracy : 0.9909
## 95% CI : (0.9842, 0.9953)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9656
##
## Mcnemar's Test P-Value : 0.001496
##
## Sensitivity : 0.9437
## Specificity : 1.0000
## Pos Pred Value : 1.0000
## Neg Pred Value : 0.9893
## Prevalence : 0.1611
## Detection Rate : 0.1520
## Detection Prevalence : 0.1520
## Balanced Accuracy : 0.9718
##
## 'Positive' Class : Left
##
myRF_mod
## 3088 samples
## 23 predictor
## 2 classes: 'Stayed', 'Left'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 3 times)
## Summary of sample sizes: 2470, 2470, 2471, 2471, 2470, 2471, ...
## Resampling results across tuning parameters:
##
## mtry ntree ROC Sens Spec
## 5 101 0.9831072 0.9989704 0.7884781
## 5 501 0.9858427 0.9988417 0.7925118
## 5 1001 0.9865139 0.9987130 0.7904848
## 5 2001 0.9861856 0.9988417 0.7898249
## 6 101 0.9814024 0.9983269 0.8011852
## 6 501 0.9849812 0.9985843 0.7924983
## 6 1001 0.9850839 0.9985843 0.7971987
## 6 2001 0.9857870 0.9988417 0.7965118
## 7 101 0.9840768 0.9976834 0.7985185
## 7 501 0.9853648 0.9980695 0.7945118
## 7 1001 0.9853906 0.9987130 0.7918384
## 7 2001 0.9853033 0.9983269 0.7951852
## 8 101 0.9825889 0.9975547 0.7991717
## 8 501 0.9845302 0.9979408 0.7991785
## 8 1001 0.9850600 0.9980695 0.7965185
## 8 2001 0.9849888 0.9979408 0.8005253
## 9 101 0.9838705 0.9980695 0.8058586
## 9 501 0.9831738 0.9979408 0.7971717
## 9 1001 0.9836694 0.9980695 0.8011987
## 9 2001 0.9837432 0.9976834 0.7978384
## 10 101 0.9838431 0.9974260 0.8038249
## 10 501 0.9833145 0.9979408 0.7971515
## 10 1001 0.9837660 0.9978121 0.8045320
## 10 2001 0.9836086 0.9979408 0.7971785
## 11 101 0.9801069 0.9978121 0.8052256
## 11 501 0.9833862 0.9975547 0.7998586
## 11 1001 0.9831305 0.9975547 0.8018653
## 11 2001 0.9834334 0.9979408 0.8031987
## 12 101 0.9815505 0.9972973 0.8058519
## 12 501 0.9824013 0.9974260 0.8045387
## 12 1001 0.9822733 0.9976834 0.8045387
## 12 2001 0.9827472 0.9978121 0.8052189
##
## Sens was used to select the optimal model using the largest value.
## The final values used for the model were mtry = 5 and ntree = 101.
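The selected hyperparameters can also be read directly off the train object rather than from the printed grid. A short sketch assuming myRF_mod from above; bestTune and results are standard caret fields.

```r
# Winning combination under the "Sens" metric (mtry = 5, ntree = 101 above).
myRF_mod$bestTune

# Full resampled grid ordered by sensitivity, for closer inspection of how
# flat the Sens surface is across mtry/ntree.
head(myRF_mod$results[order(-myRF_mod$results$Sens), ])
```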
Insights
The final model for comparison is the Neural Network algorithm.
Create a copy of the partitioned, scaled datasets as before.
dNN_train <- dt_train
dNN_test <- dt_test
head(dNN_train)
## Age Attrition BusinessTravel Department
## 1 1.5351693 Stayed Travel-Rarely Sales
## 3 -0.5410709 Stayed Travel-Frequently Research & Development
## 5 -0.5410709 Stayed Travel-Rarely Research & Development
## 6 0.9887903 Stayed Travel-Rarely Research & Development
## 7 -0.9781741 Left Travel-Rarely Research & Development
## 9 -0.6503467 Stayed Travel-Rarely Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 1 -0.39949765 College Life Sciences Female 1
## 3 0.94741412 Master Other Male 4
## 5 0.09028845 Below College Medical Male 1
## 6 -0.15460460 Bachelor Life Sciences Female 4
## 7 0.21273497 College Medical Male 2
## 9 -1.01173027 Bachelor Life Sciences Male 3
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 1 Healthcare Representative Married 1.4088205 -0.6721110
## 3 Sales Executive Married 2.7295641 -0.6721110
## 5 Sales Executive Single -0.8818574 0.5352647
## 6 Research Director Married -0.5142519 0.1328061
## 7 Sales Executive Single -0.1438824 -0.2696525
## 9 Laboratory Technician Married -0.9452157 -1.0745696
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 1 -1.15935900 0 -1.3227615 2.5011973
## 3 -0.06678233 3 -0.8096420 -0.6088076
## 5 -0.88621483 2 -0.2965226 -0.6088076
## 6 -0.61307067 0 2.1407948 1.7236961
## 7 1.29893850 1 -0.8096420 -0.6088076
## 9 1.57208267 0 -0.1682427 -0.6088076
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 1 -0.97857482 -0.6824716 -1.16099726
## 3 -0.33730174 -0.6824716 -0.32872313
## 5 -0.17698348 -0.6824716 -0.05129842
## 6 -0.01666521 1.4623824 0.78097571
## 7 -1.13889309 -0.6824716 -1.16099726
## 9 0.30397133 1.4623824 1.05840042
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 1 High Very High Good High
## 3 Medium Medium Bad High
## 5 Very High Low Better High
## 6 High Medium Good High
## 7 Low High Bad High
## 9 Medium Very High Better High
## PerformanceRating AvgHrs
## 1 Excellent -0.2538199
## 3 Excellent -0.5216240
## 5 Excellent 0.2222764
## 6 Excellent 2.2977583
## 7 Outstanding -0.5885750
## 9 Outstanding -0.3505269
head(dNN_test)
## Age Attrition BusinessTravel Department
## 2 -0.6503467 Left Travel-Frequently Research & Development
## 4 0.1145839 Stayed Non-Travel Research & Development
## 8 -0.8688983 Stayed Travel-Rarely Research & Development
## 14 1.0980661 Left Non-Travel Research & Development
## 17 -1.7431047 Stayed Travel-Rarely Research & Development
## 21 -1.1967257 Stayed Travel-Frequently Research & Development
## DistanceFromHome Education EducationField Gender JobLevel
## 2 0.09028845 Below College Life Sciences Female 1
## 4 -0.88928374 Doctor Life Sciences Male 3
## 8 1.06986064 Bachelor Life Sciences Male 2
## 14 -1.01173027 Below College Medical Male 1
## 17 -0.76683722 College Life Sciences Male 1
## 21 -1.01173027 Master Other Male 2
## JobRole MaritalStatus MonthlyIncome NumCompaniesWorked
## 2 Research Scientist Single -0.4891637 -1.0745696
## 4 Human Resources Married 0.3893476 0.1328061
## 8 Sales Executive Married -0.7115555 -0.2696525
## 14 Research Scientist Married -0.1547256 -0.6721110
## 17 Laboratory Technician Single -0.4840610 -0.6721110
## 21 Laboratory Technician Divorced 0.8413600 -0.6721110
## PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 2 2.1183710 1 -0.6813621 0.1686936
## 4 -1.1593590 3 0.2165969 1.7236961
## 8 1.8452268 3 -0.1682427 -0.6088076
## 14 -1.1593590 2 -0.1682427 0.9461948
## 17 -0.8862148 3 -1.0662017 0.1686936
## 21 0.7526502 0 -0.6813621 0.1686936
## YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
## 2 -0.3373017 -0.3760639 -0.05129842
## 4 0.1436531 1.4623824 0.22612629
## 8 -1.1388931 -0.6824716 -1.16099726
## 14 0.4642896 2.0751978 1.33582514
## 17 -0.6579383 -0.3760639 -1.16099726
## 21 -0.1769835 -0.3760639 -0.05129842
## EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement
## 2 High Medium Best Medium
## 4 Very High Very High Better Medium
## 8 Low Medium Better High
## 14 Low Medium Good Medium
## 17 Very High High Best Medium
## 21 High Medium Bad High
## PerformanceRating AvgHrs
## 2 Outstanding 0.006545263
## 4 Excellent -0.387721916
## 8 Outstanding -0.729916072
## 14 Excellent 1.256297831
## 17 Excellent -0.811745109
## 21 Excellent -0.090161781
The perf_met() and feature_imp() helper functions defined in the Random Forest section above are reused here unchanged.
# Fit a single-hidden-layer neural network; caret tunes size and decay by default.
# Tip: passing trace = FALSE through train() would suppress the iteration log below.
nn_mod <- train(Attrition ~ .,
                data = dNN_train,
                method = "nnet")
## # weights: 63
## initial value 2708.638435
## iter 10 value 1186.350693
## iter 20 value 1069.462031
## iter 30 value 1011.774638
## iter 40 value 974.843920
## iter 50 value 951.724334
## iter 60 value 946.526260
## iter 70 value 946.513051
## final value 946.512998
## converged
## # weights: 187
## initial value 2782.076469
## iter 10 value 1368.501277
## iter 20 value 1300.541478
## iter 30 value 1226.148970
## iter 40 value 1162.375646
## iter 50 value 1137.780134
## iter 60 value 1114.653380
## iter 70 value 1097.589681
## iter 80 value 1091.582789
## iter 90 value 1053.853154
## iter 100 value 986.440765
## final value 986.440765
## stopped after 100 iterations
## # weights: 311
## initial value 2200.118623
## iter 10 value 1148.316073
## iter 20 value 922.714514
## iter 30 value 778.738367
## iter 40 value 686.184812
## iter 50 value 657.418196
## iter 60 value 632.466211
## iter 70 value 606.036621
## iter 80 value 590.154980
## iter 90 value 568.327155
## iter 100 value 554.764469
## final value 554.764469
## stopped after 100 iterations
## # weights: 63
## initial value 2004.275011
## iter 10 value 1394.164128
## iter 20 value 1134.750602
## iter 30 value 1063.812723
## iter 40 value 1014.404238
## iter 50 value 996.506050
## iter 60 value 995.447432
## iter 70 value 995.296835
## iter 80 value 995.008574
## iter 90 value 995.004145
## iter 90 value 995.004137
## iter 90 value 995.004137
## final value 995.004137
## converged
## # weights: 187
## initial value 2462.425907
## iter 10 value 1309.391764
## iter 20 value 1140.039296
## iter 30 value 992.368101
## iter 40 value 921.235070
## iter 50 value 845.313146
## iter 60 value 791.541421
## iter 70 value 750.612356
## iter 80 value 727.056030
## iter 90 value 708.909453
## iter 100 value 684.765113
## final value 684.765113
## stopped after 100 iterations
## # weights: 311
## initial value 1729.845408
## iter 10 value 1001.099514
## iter 20 value 766.051939
## iter 30 value 639.928089
## iter 40 value 570.949963
## iter 50 value 532.144887
## iter 60 value 495.812011
## iter 70 value 467.134801
## iter 80 value 444.812820
## iter 90 value 423.002959
## iter 100 value 411.161307
## final value 411.161307
## stopped after 100 iterations
## # weights: 63
## initial value 2433.077205
## iter 10 value 1177.314677
## iter 20 value 1062.391261
## iter 30 value 1003.295077
## iter 40 value 978.800558
## iter 50 value 956.886696
## iter 60 value 953.034911
## iter 70 value 952.582156
## iter 80 value 952.468848
## iter 90 value 952.421102
## iter 100 value 952.383807
## final value 952.383807
## stopped after 100 iterations
## # weights: 187
## initial value 1910.774679
## iter 10 value 1115.584731
## iter 20 value 969.102553
## iter 30 value 863.230477
## iter 40 value 772.029252
## iter 50 value 723.767095
## iter 60 value 703.494652
## iter 70 value 654.547158
## iter 80 value 640.524410
## iter 90 value 630.745175
## iter 100 value 620.734482
## final value 620.734482
## stopped after 100 iterations
## # weights: 311
## initial value 1581.088647
## iter 10 value 1017.416944
## iter 20 value 786.555568
## iter 30 value 647.192697
## iter 40 value 575.982658
## iter 50 value 506.423710
## iter 60 value 488.712344
## iter 70 value 478.367101
## iter 80 value 469.356205
## iter 90 value 461.928856
## iter 100 value 451.963190
## final value 451.963190
## stopped after 100 iterations
## # weights: 63
## initial value 2883.478354
## iter 10 value 1231.371902
## iter 20 value 1057.120917
## iter 30 value 980.012457
## iter 40 value 952.979321
## iter 50 value 921.467947
## iter 60 value 907.627038
## iter 70 value 904.695012
## iter 80 value 904.626354
## iter 90 value 904.622981
## final value 904.622959
## converged
## # weights: 187
## initial value 1867.638300
## iter 10 value 1027.227422
## iter 20 value 837.787438
## iter 30 value 730.890526
## iter 40 value 680.605311
## iter 50 value 634.890225
## iter 60 value 596.172360
## iter 70 value 572.737051
## iter 80 value 556.000186
## iter 90 value 545.409751
## iter 100 value 533.423873
## final value 533.423873
## stopped after 100 iterations
## # weights: 311
## initial value 2867.687209
## iter 10 value 1136.452469
## iter 20 value 900.990361
## iter 30 value 755.173149
## iter 40 value 640.664429
## iter 50 value 536.715701
## iter 60 value 483.164199
## iter 70 value 421.579344
## iter 80 value 399.325258
## iter 90 value 375.931971
## iter 100 value 337.237344
## final value 337.237344
## stopped after 100 iterations
## # weights: 63
## initial value 2241.993147
## iter 10 value 1188.590211
## iter 20 value 1026.915140
## iter 30 value 983.354843
## iter 40 value 967.319621
## iter 50 value 955.354536
## iter 60 value 952.100538
## iter 70 value 943.803833
## iter 80 value 936.339944
## iter 90 value 936.161097
## iter 100 value 936.142762
## final value 936.142762
## stopped after 100 iterations
## # weights: 187
## initial value 2463.168753
## iter 10 value 1061.855396
## iter 20 value 915.503779
## iter 30 value 829.200653
## iter 40 value 792.951683
## iter 50 value 780.129796
## iter 60 value 756.888828
## iter 70 value 745.598946
## iter 80 value 739.261573
## iter 90 value 732.246762
## iter 100 value 721.685505
## final value 721.685505
## stopped after 100 iterations
## # weights: 311
## initial value 2177.278509
## iter 10 value 931.509551
## iter 20 value 774.825749
## iter 30 value 654.353347
## iter 40 value 571.325618
## iter 50 value 524.431190
## iter 60 value 496.287707
## iter 70 value 477.214997
## iter 80 value 456.403202
## iter 90 value 435.690141
## iter 100 value 423.990404
## final value 423.990404
## stopped after 100 iterations
## # weights: 63
## initial value 2374.062019
## iter 10 value 1192.877433
## iter 20 value 1043.828915
## iter 30 value 947.058252
## iter 40 value 919.595576
## iter 50 value 903.057983
## iter 60 value 895.536127
## iter 70 value 895.082416
## iter 80 value 894.979659
## iter 90 value 894.967909
## iter 100 value 894.960911
## final value 894.960911
## stopped after 100 iterations
## # weights: 187
## initial value 1928.390240
## iter 10 value 1156.671793
## iter 20 value 938.276979
## iter 30 value 831.711575
## iter 40 value 770.274505
## iter 50 value 735.297772
## iter 60 value 720.794018
## iter 70 value 708.127206
## iter 80 value 703.395027
## iter 90 value 702.341118
## iter 100 value 701.583357
## final value 701.583357
## stopped after 100 iterations
## # weights: 311
## initial value 2064.279324
## iter 10 value 963.909366
## iter 20 value 770.766838
## iter 30 value 640.588330
## iter 40 value 513.226816
## iter 50 value 443.572210
## iter 60 value 399.024061
## iter 70 value 371.887458
## iter 80 value 355.335606
## iter 90 value 346.212677
## iter 100 value 344.767369
## final value 344.767369
## stopped after 100 iterations
## # weights: 63
## initial value 2416.964133
## iter 10 value 1164.563357
## iter 20 value 1034.958791
## iter 30 value 949.251356
## iter 40 value 927.806565
## iter 50 value 910.655049
## iter 60 value 882.329011
## iter 70 value 878.421002
## iter 80 value 878.382865
## final value 878.382813
## converged
## # weights: 187
## initial value 1740.010867
## iter 10 value 1032.185123
## iter 20 value 912.967878
## iter 30 value 791.978590
## iter 40 value 685.107047
## iter 50 value 638.546534
## iter 60 value 613.613400
## iter 70 value 581.470245
## iter 80 value 567.910879
## iter 90 value 558.437890
## iter 100 value 553.365829
## final value 553.365829
## stopped after 100 iterations
## # weights: 311
## initial value 2414.148219
## iter 10 value 953.437653
## iter 20 value 730.726461
## iter 30 value 527.456369
## iter 40 value 456.586739
## iter 50 value 406.330166
## iter 60 value 380.694730
## iter 70 value 365.436287
## iter 80 value 347.419984
## iter 90 value 329.070197
## iter 100 value 323.913902
## final value 323.913902
## stopped after 100 iterations
## # weights: 63 … 187 … 311
## (repeated nnet training traces omitted: one block is printed for every
## resampling fold and tuning candidate; each block lists the weight count,
## the initial deviance, the loss at every 10th iteration, and ends with
## either "converged" or "stopped after 100 iterations")
## # weights: 63
## initial value 2529.539263
## iter 10 value 1134.251782
## iter 20 value 1037.074577
## iter 30 value 1014.120172
## iter 40 value 991.297491
## iter 50 value 961.644500
## iter 60 value 937.274705
## iter 70 value 937.107436
## final value 937.106141
## converged
## # weights: 187
## initial value 2559.787401
## iter 10 value 1297.140286
## iter 20 value 1143.732171
## iter 30 value 980.146176
## iter 40 value 857.954247
## iter 50 value 793.130902
## iter 60 value 755.597075
## iter 70 value 727.940746
## iter 80 value 690.901960
## iter 90 value 671.276626
## iter 100 value 651.985236
## final value 651.985236
## stopped after 100 iterations
## # weights: 311
## initial value 3117.697165
## iter 10 value 1365.023561
## iter 20 value 1272.683998
## iter 30 value 1198.810553
## iter 40 value 1160.070718
## iter 50 value 1145.642212
## iter 60 value 1138.558994
## iter 70 value 1131.556107
## iter 80 value 1127.904974
## iter 90 value 1121.969313
## iter 100 value 1111.494485
## final value 1111.494485
## stopped after 100 iterations
## # weights: 63
## initial value 1556.722768
## iter 10 value 1138.330553
## iter 20 value 1046.589165
## iter 30 value 1026.277868
## iter 40 value 1017.625726
## iter 50 value 1015.990954
## iter 60 value 1015.426726
## iter 70 value 1015.418645
## final value 1015.418610
## converged
## # weights: 187
## initial value 3233.699428
## iter 10 value 1349.985727
## iter 20 value 1219.304023
## iter 30 value 1064.733674
## iter 40 value 972.544342
## iter 50 value 901.201412
## iter 60 value 853.422880
## iter 70 value 819.724130
## iter 80 value 794.272660
## iter 90 value 770.723301
## iter 100 value 755.292334
## final value 755.292334
## stopped after 100 iterations
## # weights: 311
## initial value 3868.682905
## iter 10 value 1315.788557
## iter 20 value 1186.902221
## iter 30 value 1090.066560
## iter 40 value 1020.174774
## iter 50 value 969.281293
## iter 60 value 943.637771
## iter 70 value 901.708424
## iter 80 value 841.484181
## iter 90 value 751.534065
## iter 100 value 686.286113
## final value 686.286113
## stopped after 100 iterations
## # weights: 63
## initial value 2608.284859
## iter 10 value 1234.227798
## iter 20 value 1078.450026
## iter 30 value 1002.915905
## iter 40 value 973.679779
## iter 50 value 951.961269
## iter 60 value 925.680719
## iter 70 value 922.410038
## iter 80 value 921.374317
## iter 90 value 920.684054
## iter 100 value 920.565155
## final value 920.565155
## stopped after 100 iterations
## # weights: 187
## initial value 2518.331288
## iter 10 value 1116.551728
## iter 20 value 881.101784
## iter 30 value 740.499797
## iter 40 value 691.328076
## iter 50 value 643.626376
## iter 60 value 617.611942
## iter 70 value 602.284671
## iter 80 value 591.412444
## iter 90 value 589.307323
## iter 100 value 588.515404
## final value 588.515404
## stopped after 100 iterations
## # weights: 311
## initial value 3052.162069
## iter 10 value 1361.786388
## iter 20 value 1176.910698
## iter 30 value 1016.512352
## iter 40 value 817.960418
## iter 50 value 659.144344
## iter 60 value 592.442464
## iter 70 value 550.896164
## iter 80 value 529.710532
## iter 90 value 498.877650
## iter 100 value 474.004920
## final value 474.004920
## stopped after 100 iterations
## # weights: 63
## initial value 1899.636084
## iter 10 value 1036.324031
## iter 20 value 949.627243
## iter 30 value 906.060246
## iter 40 value 885.023549
## iter 50 value 868.077492
## iter 60 value 854.896772
## iter 70 value 853.660755
## iter 80 value 853.503962
## iter 90 value 853.462484
## iter 100 value 853.422882
## final value 853.422882
## stopped after 100 iterations
## # weights: 187
## initial value 1739.222098
## iter 10 value 984.109927
## iter 20 value 798.463186
## iter 30 value 732.617305
## iter 40 value 674.308693
## iter 50 value 633.708848
## iter 60 value 611.188451
## iter 70 value 595.276752
## iter 80 value 576.329586
## iter 90 value 575.142496
## iter 100 value 574.807640
## final value 574.807640
## stopped after 100 iterations
## # weights: 311
## initial value 1525.533752
## iter 10 value 876.367123
## iter 20 value 644.598447
## iter 30 value 505.781980
## iter 40 value 452.160907
## iter 50 value 407.012779
## iter 60 value 385.467274
## iter 70 value 379.171017
## iter 80 value 375.314192
## iter 90 value 371.296787
## iter 100 value 363.857457
## final value 363.857457
## stopped after 100 iterations
## # weights: 63
## initial value 2472.202434
## iter 10 value 1136.726793
## iter 20 value 1058.886541
## iter 30 value 1037.300970
## iter 40 value 1015.540107
## iter 50 value 995.574267
## iter 60 value 970.351794
## iter 70 value 945.810520
## iter 80 value 935.634410
## iter 90 value 929.342508
## iter 100 value 928.151748
## final value 928.151748
## stopped after 100 iterations
## # weights: 187
## initial value 2862.651924
## iter 10 value 1299.174235
## iter 20 value 1117.162341
## iter 30 value 1019.689721
## iter 40 value 949.458240
## iter 50 value 930.590853
## iter 60 value 892.321187
## iter 70 value 794.502517
## iter 80 value 716.656799
## iter 90 value 685.831947
## iter 100 value 662.816633
## final value 662.816633
## stopped after 100 iterations
## # weights: 311
## initial value 2928.129963
## iter 10 value 1682.443606
## iter 20 value 1232.417212
## iter 30 value 962.607831
## iter 40 value 797.560769
## iter 50 value 696.820756
## iter 60 value 621.152133
## iter 70 value 583.928074
## iter 80 value 524.384659
## iter 90 value 495.957486
## iter 100 value 450.427832
## final value 450.427832
## stopped after 100 iterations
## # weights: 63
## initial value 1770.286310
## iter 10 value 1101.162421
## iter 20 value 960.843482
## iter 30 value 928.983828
## iter 40 value 906.936558
## iter 50 value 890.654883
## iter 60 value 886.500436
## iter 70 value 883.108037
## iter 80 value 882.463408
## iter 90 value 882.031413
## iter 100 value 881.605834
## final value 881.605834
## stopped after 100 iterations
## # weights: 187
## initial value 2067.371823
## iter 10 value 1291.545455
## iter 20 value 1157.765135
## iter 30 value 1064.583904
## iter 40 value 1007.406845
## iter 50 value 981.923592
## iter 60 value 958.621443
## iter 70 value 924.190660
## iter 80 value 899.884786
## iter 90 value 880.927630
## iter 100 value 857.441396
## final value 857.441396
## stopped after 100 iterations
## # weights: 311
## initial value 3347.105160
## iter 10 value 1192.830909
## iter 20 value 956.351540
## iter 30 value 821.708011
## iter 40 value 707.395586
## iter 50 value 643.206519
## iter 60 value 567.318699
## iter 70 value 533.068684
## iter 80 value 515.795686
## iter 90 value 496.066276
## iter 100 value 486.038465
## final value 486.038465
## stopped after 100 iterations
## # weights: 63
## initial value 3102.311649
## final value 1436.011228
## converged
## # weights: 187
## initial value 1497.004636
## iter 10 value 1092.773648
## iter 20 value 969.596087
## iter 30 value 865.666841
## iter 40 value 752.093616
## iter 50 value 709.806576
## iter 60 value 681.450673
## iter 70 value 660.883270
## iter 80 value 645.911386
## iter 90 value 645.201044
## iter 100 value 645.190086
## final value 645.190086
## stopped after 100 iterations
## # weights: 311
## initial value 2335.197361
## iter 10 value 1030.757951
## iter 20 value 749.624722
## iter 30 value 533.708191
## iter 40 value 441.667561
## iter 50 value 402.899809
## iter 60 value 374.289962
## iter 70 value 357.347297
## iter 80 value 338.868189
## iter 90 value 325.534785
## iter 100 value 315.748515
## final value 315.748515
## stopped after 100 iterations
## # weights: 63
## initial value 1596.606826
## iter 10 value 1139.764903
## iter 20 value 1070.835783
## iter 30 value 1040.742377
## iter 40 value 1026.765741
## iter 50 value 1025.352351
## iter 60 value 1024.876820
## iter 70 value 1024.865278
## final value 1024.864798
## converged
## # weights: 187
## initial value 2020.771259
## iter 10 value 1063.683206
## iter 20 value 961.251369
## iter 30 value 886.510415
## iter 40 value 838.646332
## iter 50 value 812.040584
## iter 60 value 789.006645
## iter 70 value 764.384341
## iter 80 value 744.965112
## iter 90 value 735.899952
## iter 100 value 730.304757
## final value 730.304757
## stopped after 100 iterations
## # weights: 311
## initial value 1914.175835
## iter 10 value 990.901853
## iter 20 value 794.647576
## iter 30 value 675.483532
## iter 40 value 620.263285
## iter 50 value 595.839288
## iter 60 value 575.870148
## iter 70 value 554.026939
## iter 80 value 536.716036
## iter 90 value 523.821286
## iter 100 value 508.041712
## final value 508.041712
## stopped after 100 iterations
## # weights: 63
## initial value 1802.454274
## iter 10 value 1167.698725
## iter 20 value 1038.274865
## iter 30 value 1009.238110
## iter 40 value 997.040877
## iter 50 value 972.507209
## iter 60 value 959.009336
## iter 70 value 958.013201
## iter 80 value 956.544935
## iter 90 value 956.114850
## iter 100 value 955.641293
## final value 955.641293
## stopped after 100 iterations
## # weights: 187
## initial value 2756.657631
## iter 10 value 1189.149479
## iter 20 value 1047.663677
## iter 30 value 984.862802
## iter 40 value 939.573283
## iter 50 value 906.368445
## iter 60 value 887.076003
## iter 70 value 879.833740
## iter 80 value 877.827854
## iter 90 value 875.441951
## iter 100 value 873.523817
## final value 873.523817
## stopped after 100 iterations
## # weights: 311
## initial value 2881.355597
## iter 10 value 1172.586231
## iter 20 value 901.447850
## iter 30 value 696.650050
## iter 40 value 555.161749
## iter 50 value 492.534502
## iter 60 value 456.160350
## iter 70 value 441.404881
## iter 80 value 430.888635
## iter 90 value 424.115781
## iter 100 value 419.097993
## final value 419.097993
## stopped after 100 iterations
## # weights: 63
## initial value 2009.923788
## iter 10 value 1190.521417
## iter 20 value 1050.849721
## iter 30 value 1003.678960
## iter 40 value 974.485055
## iter 50 value 952.372387
## iter 60 value 940.459147
## iter 70 value 940.195302
## iter 80 value 940.186378
## final value 940.186349
## converged
## # weights: 187
## initial value 3290.921948
## iter 10 value 1381.055005
## iter 20 value 1379.653393
## final value 1379.645213
## converged
## # weights: 311
## initial value 2299.495305
## iter 10 value 1001.234496
## iter 20 value 801.874217
## iter 30 value 704.508557
## iter 40 value 646.499906
## iter 50 value 592.222928
## iter 60 value 515.198358
## iter 70 value 481.942953
## iter 80 value 465.234256
## iter 90 value 459.485341
## iter 100 value 455.718763
## final value 455.718763
## stopped after 100 iterations
## # weights: 63
## initial value 2735.216633
## iter 10 value 1170.140506
## iter 20 value 1028.129860
## iter 30 value 997.984015
## iter 40 value 987.060912
## iter 50 value 986.016167
## iter 60 value 984.473988
## iter 70 value 984.180493
## iter 80 value 984.032451
## final value 983.959763
## converged
## # weights: 187
## initial value 2064.147501
## iter 10 value 1269.432254
## iter 20 value 1148.619811
## iter 30 value 967.923917
## iter 40 value 891.790549
## iter 50 value 841.202663
## iter 60 value 802.050052
## iter 70 value 775.528895
## iter 80 value 742.249540
## iter 90 value 691.279323
## iter 100 value 664.739475
## final value 664.739475
## stopped after 100 iterations
## # weights: 311
## initial value 1582.216870
## iter 10 value 934.004520
## iter 20 value 720.251774
## iter 30 value 602.874610
## iter 40 value 538.482333
## iter 50 value 493.858449
## iter 60 value 469.152776
## iter 70 value 453.236116
## iter 80 value 433.788090
## iter 90 value 413.132196
## iter 100 value 398.517667
## final value 398.517667
## stopped after 100 iterations
## # weights: 63
## initial value 2412.083490
## iter 10 value 1119.487478
## iter 20 value 993.833100
## iter 30 value 944.182920
## iter 40 value 917.346929
## iter 50 value 895.979083
## iter 60 value 859.869168
## iter 70 value 854.636025
## iter 80 value 854.099974
## iter 90 value 852.174554
## iter 100 value 850.845916
## final value 850.845916
## stopped after 100 iterations
## # weights: 187
## initial value 1503.662229
## iter 10 value 956.067509
## iter 20 value 817.478174
## iter 30 value 740.497213
## iter 40 value 686.367298
## iter 50 value 654.068442
## iter 60 value 639.356774
## iter 70 value 632.765689
## iter 80 value 631.160283
## iter 90 value 629.821264
## iter 100 value 629.394787
## final value 629.394787
## stopped after 100 iterations
## # weights: 311
## initial value 3320.661393
## iter 10 value 1359.349265
## iter 20 value 1218.992199
## iter 30 value 1035.086449
## iter 40 value 868.214616
## iter 50 value 717.659909
## iter 60 value 609.338560
## iter 70 value 561.325556
## iter 80 value 540.353643
## iter 90 value 517.207027
## iter 100 value 496.047921
## final value 496.047921
## stopped after 100 iterations
## # weights: 63
## initial value 2796.000630
## iter 10 value 984.382315
## iter 20 value 891.245433
## iter 30 value 857.542019
## iter 40 value 837.207030
## iter 50 value 801.605533
## iter 60 value 765.438340
## iter 70 value 764.160128
## final value 764.158814
## converged
## # weights: 187
## initial value 2305.161801
## iter 10 value 1312.450796
## iter 20 value 1217.717473
## iter 30 value 1146.275214
## iter 40 value 1108.111849
## iter 50 value 1016.301622
## iter 60 value 985.189918
## iter 70 value 946.205736
## iter 80 value 875.843618
## iter 90 value 799.311685
## iter 100 value 777.118108
## final value 777.118108
## stopped after 100 iterations
## # weights: 311
## initial value 2098.363447
## iter 10 value 1086.851051
## iter 20 value 863.942784
## iter 30 value 714.153097
## iter 40 value 612.520788
## iter 50 value 558.033821
## iter 60 value 533.114499
## iter 70 value 512.711450
## iter 80 value 505.997632
## iter 90 value 502.294605
## iter 100 value 490.826291
## final value 490.826291
## stopped after 100 iterations
## # weights: 63
## initial value 1996.199272
## iter 10 value 1254.881410
## iter 20 value 1052.537550
## iter 30 value 949.337412
## iter 40 value 927.668856
## iter 50 value 915.248476
## iter 60 value 911.003555
## iter 70 value 909.386041
## iter 80 value 908.212276
## iter 90 value 906.883337
## iter 100 value 905.581678
## final value 905.581678
## stopped after 100 iterations
## # weights: 187
## initial value 2063.348109
## iter 10 value 1144.721798
## iter 20 value 997.098189
## iter 30 value 879.663916
## iter 40 value 816.896472
## iter 50 value 736.412444
## iter 60 value 688.972743
## iter 70 value 665.353019
## iter 80 value 654.004438
## iter 90 value 648.023443
## iter 100 value 640.733863
## final value 640.733863
## stopped after 100 iterations
## # weights: 311
## initial value 2732.522345
## iter 10 value 1191.291674
## iter 20 value 1029.711816
## iter 30 value 943.525469
## iter 40 value 857.698464
## iter 50 value 781.100656
## iter 60 value 721.632170
## iter 70 value 630.532678
## iter 80 value 542.696947
## iter 90 value 471.624510
## iter 100 value 438.597168
## final value 438.597168
## stopped after 100 iterations
## # weights: 63
## initial value 1833.571241
## iter 10 value 1201.936184
## iter 20 value 1127.349040
## iter 30 value 991.548529
## iter 40 value 921.064724
## iter 50 value 900.915553
## iter 60 value 892.993372
## iter 70 value 890.488275
## iter 80 value 864.449701
## iter 90 value 807.950158
## iter 100 value 800.239117
## final value 800.239117
## stopped after 100 iterations
## # weights: 187
## initial value 1811.151839
## iter 10 value 1031.272631
## iter 20 value 815.760935
## iter 30 value 711.199394
## iter 40 value 649.197949
## iter 50 value 620.152371
## iter 60 value 608.684563
## iter 70 value 599.845801
## iter 80 value 590.288276
## iter 90 value 587.022055
## iter 100 value 586.339680
## final value 586.339680
## stopped after 100 iterations
## # weights: 311
## initial value 1976.340490
## iter 10 value 1071.921435
## iter 20 value 856.959019
## iter 30 value 742.703566
## iter 40 value 654.909073
## iter 50 value 614.826029
## iter 60 value 579.128345
## iter 70 value 556.210475
## iter 80 value 544.097152
## iter 90 value 531.068169
## iter 100 value 526.640035
## final value 526.640035
## stopped after 100 iterations
## # weights: 63
## initial value 2391.914003
## iter 10 value 1057.586783
## iter 20 value 964.924917
## iter 30 value 935.966918
## iter 40 value 914.009133
## iter 50 value 879.089998
## iter 60 value 876.195250
## iter 70 value 876.187280
## final value 876.187265
## converged
## # weights: 187
## initial value 2217.104448
## iter 10 value 1145.597870
## iter 20 value 976.358292
## iter 30 value 841.693862
## iter 40 value 743.211723
## iter 50 value 705.398309
## iter 60 value 683.935105
## iter 70 value 663.293662
## iter 80 value 631.164507
## iter 90 value 624.397273
## iter 100 value 620.890018
## final value 620.890018
## stopped after 100 iterations
## # weights: 311
## initial value 1710.109736
## iter 10 value 1038.475707
## iter 20 value 778.910827
## iter 30 value 678.811259
## iter 40 value 629.944188
## iter 50 value 598.431899
## iter 60 value 563.893864
## iter 70 value 544.262450
## iter 80 value 525.779758
## iter 90 value 516.864951
## iter 100 value 514.938261
## final value 514.938261
## stopped after 100 iterations
## # weights: 63
## initial value 2188.412145
## iter 10 value 1179.159014
## iter 20 value 1056.231719
## iter 30 value 1015.312607
## iter 40 value 995.377267
## iter 50 value 972.623063
## iter 60 value 946.877293
## iter 70 value 942.583399
## iter 80 value 942.083170
## iter 90 value 942.052801
## final value 942.052663
## converged
## # weights: 187
## initial value 2962.841094
## iter 10 value 1188.306320
## iter 20 value 1048.687582
## iter 30 value 983.733110
## iter 40 value 939.119145
## iter 50 value 906.163356
## iter 60 value 865.594378
## iter 70 value 795.501623
## iter 80 value 742.679837
## iter 90 value 717.460057
## iter 100 value 702.154091
## final value 702.154091
## stopped after 100 iterations
## # weights: 311
## initial value 1903.859440
## iter 10 value 1201.068135
## iter 20 value 994.900314
## iter 30 value 785.709917
## iter 40 value 661.773806
## iter 50 value 582.322656
## iter 60 value 520.947364
## iter 70 value 480.914175
## iter 80 value 450.441759
## iter 90 value 424.081155
## iter 100 value 411.430683
## final value 411.430683
## stopped after 100 iterations
## # weights: 63
## initial value 1644.247385
## iter 10 value 1065.422642
## iter 20 value 984.227520
## iter 30 value 948.645783
## iter 40 value 934.116433
## iter 50 value 897.358297
## iter 60 value 891.940006
## iter 70 value 891.501528
## iter 80 value 891.062852
## iter 90 value 890.754341
## iter 100 value 890.407591
## final value 890.407591
## stopped after 100 iterations
## # weights: 187
## initial value 2889.395777
## iter 10 value 1248.223702
## iter 20 value 1028.016897
## iter 30 value 855.425179
## iter 40 value 771.653416
## iter 50 value 709.959898
## iter 60 value 681.156095
## iter 70 value 667.636251
## iter 80 value 652.936926
## iter 90 value 649.209314
## iter 100 value 647.296569
## final value 647.296569
## stopped after 100 iterations
## # weights: 311
## initial value 2007.323765
## iter 10 value 1081.401311
## iter 20 value 889.570320
## iter 30 value 750.215137
## iter 40 value 663.587914
## iter 50 value 542.075651
## iter 60 value 506.600950
## iter 70 value 470.889743
## iter 80 value 451.541646
## iter 90 value 446.457062
## iter 100 value 443.159197
## final value 443.159197
## stopped after 100 iterations
## # weights: 63
## initial value 2669.633915
## iter 10 value 1045.323717
## iter 20 value 923.632749
## iter 30 value 892.474108
## iter 40 value 878.561741
## iter 50 value 835.027982
## iter 60 value 826.716789
## iter 70 value 825.696074
## iter 80 value 825.687568
## iter 90 value 825.685398
## iter 100 value 825.684601
## final value 825.684601
## stopped after 100 iterations
## # weights: 187
## initial value 2815.882043
## iter 10 value 1176.152588
## iter 20 value 882.019176
## iter 30 value 746.250284
## iter 40 value 687.577929
## iter 50 value 652.812089
## iter 60 value 629.834592
## iter 70 value 616.447642
## iter 80 value 608.836698
## iter 90 value 604.857015
## iter 100 value 600.691164
## final value 600.691164
## stopped after 100 iterations
## # weights: 311
## initial value 1571.333614
## iter 10 value 800.053442
## iter 20 value 519.102289
## iter 30 value 402.679306
## iter 40 value 352.120420
## iter 50 value 327.057807
## iter 60 value 317.273318
## iter 70 value 307.800591
## iter 80 value 298.094076
## iter 90 value 286.967660
## iter 100 value 280.821191
## final value 280.821191
## stopped after 100 iterations
## # weights: 63
## initial value 2715.212372
## iter 10 value 1114.582972
## iter 20 value 1040.168483
## iter 30 value 997.350947
## iter 40 value 985.566079
## iter 50 value 966.077319
## iter 60 value 944.951782
## iter 70 value 931.426650
## iter 80 value 921.152355
## iter 90 value 919.322052
## iter 100 value 918.357583
## final value 918.357583
## stopped after 100 iterations
## # weights: 187
## initial value 2286.441983
## iter 10 value 1132.100554
## iter 20 value 961.138236
## iter 30 value 852.289455
## iter 40 value 788.076761
## iter 50 value 753.251892
## iter 60 value 735.848920
## iter 70 value 725.201568
## iter 80 value 713.594920
## iter 90 value 707.271733
## iter 100 value 700.203783
## final value 700.203783
## stopped after 100 iterations
## # weights: 311
## initial value 3504.045679
## iter 10 value 1198.766422
## iter 20 value 1099.290279
## iter 30 value 1028.348773
## iter 40 value 928.783657
## iter 50 value 851.462584
## iter 60 value 802.758515
## iter 70 value 753.981723
## iter 80 value 685.074970
## iter 90 value 627.925047
## iter 100 value 584.325961
## final value 584.325961
## stopped after 100 iterations
## # weights: 63
## initial value 3265.673978
## iter 10 value 1259.795185
## iter 20 value 1088.956383
## iter 30 value 1014.041697
## iter 40 value 1003.558400
## iter 50 value 995.588441
## iter 60 value 976.457422
## iter 70 value 975.485314
## iter 80 value 975.343658
## iter 90 value 974.258670
## iter 100 value 974.109378
## final value 974.109378
## stopped after 100 iterations
## # weights: 187
## initial value 2619.794975
## iter 10 value 1221.637379
## iter 20 value 979.160501
## iter 30 value 913.417035
## iter 40 value 868.359481
## iter 50 value 846.574499
## iter 60 value 834.544394
## iter 70 value 814.175544
## iter 80 value 790.132414
## iter 90 value 767.355435
## iter 100 value 754.970314
## final value 754.970314
## stopped after 100 iterations
## # weights: 311
## initial value 2817.140649
## iter 10 value 1206.302894
## iter 20 value 957.983688
## iter 30 value 801.380772
## iter 40 value 694.243740
## iter 50 value 611.981187
## iter 60 value 584.518724
## iter 70 value 566.404641
## iter 80 value 558.687054
## iter 90 value 550.974524
## iter 100 value 546.313643
## final value 546.313643
## stopped after 100 iterations
## # weights: 63
## initial value 2256.258533
## iter 10 value 1138.314027
## iter 20 value 1009.103714
## iter 30 value 964.653495
## iter 40 value 950.135756
## iter 50 value 925.212168
## iter 60 value 908.565327
## iter 70 value 908.405691
## final value 908.404535
## converged
## # weights: 187
## initial value 2794.339136
## iter 10 value 1195.252929
## iter 20 value 1023.684041
## iter 30 value 894.993467
## iter 40 value 829.922099
## iter 50 value 706.634354
## iter 60 value 659.898031
## iter 70 value 634.868862
## iter 80 value 621.504113
## iter 90 value 607.768872
## iter 100 value 603.387585
## final value 603.387585
## stopped after 100 iterations
## # weights: 311
## initial value 2414.145256
## iter 10 value 1303.496581
## iter 20 value 1081.547603
## iter 30 value 863.012286
## iter 40 value 749.818098
## iter 50 value 670.294460
## iter 60 value 635.159761
## iter 70 value 584.556633
## iter 80 value 552.463227
## iter 90 value 540.607628
## iter 100 value 524.353628
## final value 524.353628
## stopped after 100 iterations
## # weights: 63
## initial value 1738.332492
## iter 10 value 1119.108214
## iter 20 value 1046.390241
## iter 30 value 1028.917658
## iter 40 value 1024.067670
## iter 50 value 1016.454530
## iter 60 value 1005.737150
## iter 70 value 990.837794
## iter 80 value 973.773701
## iter 90 value 968.366015
## iter 100 value 967.517727
## final value 967.517727
## stopped after 100 iterations
## # weights: 187
## initial value 1681.884134
## iter 10 value 1103.436497
## iter 20 value 954.844302
## iter 30 value 874.188818
## iter 40 value 825.383704
## iter 50 value 788.378377
## iter 60 value 759.290097
## iter 70 value 741.845593
## iter 80 value 727.865045
## iter 90 value 712.469503
## iter 100 value 698.915944
## final value 698.915944
## stopped after 100 iterations
## # weights: 311
## initial value 1516.048856
## iter 10 value 1078.778557
## iter 20 value 915.435438
## iter 30 value 811.406885
## iter 40 value 738.360427
## iter 50 value 657.707929
## iter 60 value 599.403861
## iter 70 value 560.685921
## iter 80 value 536.768255
## iter 90 value 519.482567
## iter 100 value 493.875003
## final value 493.875003
## stopped after 100 iterations
## # weights: 63
## initial value 1631.259362
## iter 10 value 1203.623554
## iter 20 value 1092.637151
## iter 30 value 1018.096620
## iter 40 value 987.787624
## iter 50 value 970.585858
## iter 60 value 946.394673
## iter 70 value 943.714234
## iter 80 value 942.684535
## iter 90 value 942.406230
## iter 100 value 942.212290
## final value 942.212290
## stopped after 100 iterations
## # weights: 187
## initial value 2562.907885
## iter 10 value 1197.533563
## iter 20 value 990.846230
## iter 30 value 853.964122
## iter 40 value 751.102885
## iter 50 value 703.801095
## iter 60 value 694.832619
## iter 70 value 682.728953
## iter 80 value 674.581628
## iter 90 value 669.265970
## iter 100 value 663.732508
## final value 663.732508
## stopped after 100 iterations
## # weights: 311
## initial value 2297.098826
## iter 10 value 1188.099722
## iter 20 value 950.479508
## iter 30 value 808.978632
## iter 40 value 690.380825
## iter 50 value 558.334184
## iter 60 value 489.615883
## iter 70 value 466.342653
## iter 80 value 447.020837
## iter 90 value 425.604549
## iter 100 value 405.806609
## final value 405.806609
## stopped after 100 iterations
## # weights: 63
## initial value 2287.917914
## iter 10 value 1178.719838
## iter 20 value 1073.146366
## iter 30 value 993.616250
## iter 40 value 972.063405
## iter 50 value 936.426010
## iter 60 value 921.754434
## iter 70 value 921.119928
## iter 80 value 921.114799
## iter 90 value 921.111429
## iter 100 value 921.110364
## final value 921.110364
## stopped after 100 iterations
## # weights: 187
## initial value 1887.728477
## iter 10 value 1237.106993
## iter 20 value 999.661506
## iter 30 value 872.108630
## iter 40 value 810.343667
## iter 50 value 753.876401
## iter 60 value 692.016370
## iter 70 value 658.902503
## iter 80 value 627.180328
## iter 90 value 614.073796
## iter 100 value 613.382145
## final value 613.382145
## stopped after 100 iterations
## # weights: 311
## initial value 2434.449624
## iter 10 value 1015.948514
## iter 20 value 783.324441
## iter 30 value 562.311706
## iter 40 value 449.959829
## iter 50 value 387.675896
## iter 60 value 362.793409
## iter 70 value 349.239752
## iter 80 value 335.752152
## iter 90 value 330.453891
## iter 100 value 325.552687
## final value 325.552687
## stopped after 100 iterations
## # weights: 63
## initial value 1888.058432
## iter 10 value 1139.940377
## iter 20 value 1029.971943
## iter 30 value 983.701615
## iter 40 value 954.523171
## iter 50 value 943.812827
## iter 60 value 934.130974
## iter 70 value 922.152124
## iter 80 value 915.961388
## iter 90 value 909.823296
## iter 100 value 906.910416
## final value 906.910416
## stopped after 100 iterations
## # weights: 187
## initial value 1805.570588
## iter 10 value 1137.918169
## iter 20 value 966.117604
## iter 30 value 898.514382
## iter 40 value 861.092523
## iter 50 value 839.098452
## iter 60 value 823.658051
## iter 70 value 798.888592
## iter 80 value 780.591663
## iter 90 value 753.453113
## iter 100 value 728.185696
## final value 728.185696
## stopped after 100 iterations
## # weights: 311
## initial value 1670.443885
## iter 10 value 959.729442
## iter 20 value 811.543399
## iter 30 value 724.903963
## iter 40 value 635.784206
## iter 50 value 561.932768
## iter 60 value 512.568920
## iter 70 value 466.427525
## iter 80 value 423.966590
## iter 90 value 394.396412
## iter 100 value 370.012023
## final value 370.012023
## stopped after 100 iterations
## (similar convergence traces for the remaining cross-validation fits omitted)
## # weights: 311
## initial value 3955.553178
## iter 10 value 1314.508491
## iter 20 value 1188.522661
## iter 30 value 1095.376334
## iter 40 value 1008.613244
## iter 50 value 934.788501
## iter 60 value 851.783121
## iter 70 value 808.574226
## iter 80 value 763.308422
## iter 90 value 721.393968
## iter 100 value 698.621884
## final value 698.621884
## stopped after 100 iterations
dNN_test$score = predict(nn_mod, newdata=dNN_test)
perf_met(dNN_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1020 76
## Left 89 137
##
## Accuracy : 0.8752
## 95% CI : (0.8562, 0.8925)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : 0.0001244
##
## Kappa : 0.5494
##
## Mcnemar's Test P-Value : 0.3502014
##
## Sensitivity : 0.6432
## Specificity : 0.9197
## Pos Pred Value : 0.6062
## Neg Pred Value : 0.9307
## Prevalence : 0.1611
## Detection Rate : 0.1036
## Detection Prevalence : 0.1710
## Balanced Accuracy : 0.7815
##
## 'Positive' Class : Left
##
feature_imp(nn_mod$finalModel)
Insight
# Up-weight the minority class ('Left') to counter class imbalance
weights = ifelse(dNN_train$Attrition == 'Left', 0.84, 0.16)
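The 0.84/0.16 weights mirror the approximate class proportions in the data. As a sketch of an alternative (not the author's original choice), the weights can be derived from the training split itself, so they remain correct if the split changes; this assumes `dNN_train$Attrition` uses the levels 'Stayed' and 'Left':

```r
# Derive case weights from observed class frequencies: each observation is
# weighted by the share of the *other* class, so the minority class counts more.
p_left <- mean(dNN_train$Attrition == 'Left')
weights_alt <- ifelse(dNN_train$Attrition == 'Left', 1 - p_left, p_left)
```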
fitControl <- trainControl(method = "repeatedcv",
number = 5,
repeats = 3,
returnResamp="all",
savePredictions = TRUE,
classProbs = TRUE,
summaryFunction = twoClassSummary)
paramGrid <- expand.grid(size = c(3, 6, 9, 12, 15), decay = c(1.0, 0.5, 0.1))
set.seed(1234)
nn_mod_1 <- train(Attrition ~
Age +
BusinessTravel +
Department +
DistanceFromHome +
Education +
EducationField +
Gender +
JobLevel +
JobRole +
MaritalStatus +
MonthlyIncome +
NumCompaniesWorked +
PercentSalaryHike +
StockOptionLevel +
TotalWorkingYears +
TrainingTimesLastYear +
YearsAtCompany +
YearsSinceLastPromotion +
YearsWithCurrManager +
EnvironmentSatisfaction +
JobSatisfaction +
WorkLifeBalance +
JobInvolvement +
PerformanceRating +
AvgHrs,
data = dNN_train,
method = "nnet", # Neural network model
trControl = fitControl,
tuneGrid = paramGrid,
weights = weights,
trace = FALSE,
metric="Sens")
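Before plotting, it can be useful to confirm which `size`/`decay` combination caret selected under the `Sens` metric and how the resampled performance varied across the grid. A short sketch against the `nn_mod_1` object created above:

```r
# Best size/decay combination chosen by the "Sens" metric
nn_mod_1$bestTune
# Resampled ROC/Sens/Spec for every grid point, averaged over folds and repeats,
# sorted by sensitivity
nn_mod_1$results[order(-nn_mod_1$results$Sens), ]
```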
plot(nn_mod_1)
varImp(nn_mod_1)
## nnet variable importance
##
## only 20 most important variables shown (out of 60)
##
## Overall
## GenderMale 100.00
## AvgHrs 95.51
## TrainingTimesLastYear 78.01
## YearsWithCurrManager 74.30
## JobRoleLaboratory Technician 69.33
## DistanceFromHome 68.67
## StockOptionLevel.C 56.44
## BusinessTravel.Q 54.63
## JobLevel^4 51.38
## Education^4 49.12
## NumCompaniesWorked 48.65
## WorkLifeBalance^4 47.91
## DepartmentResearch & Development 47.04
## YearsAtCompany 46.96
## JobRoleSales Representative 46.74
## JobRoleResearch Scientist 46.72
## JobInvolvement.Q 46.23
## JobSatisfaction.C 45.84
## TotalWorkingYears 44.48
## EnvironmentSatisfaction.Q 43.04
dNN_test$score = predict(nn_mod_1, newdata=dNN_test)
perf_met(dNN_test)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Stayed Left
## Stayed 1084 5
## Left 25 208
##
## Accuracy : 0.9773
## 95% CI : (0.9678, 0.9846)
## No Information Rate : 0.8389
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9191
##
## Mcnemar's Test P-Value : 0.0005226
##
## Sensitivity : 0.9765
## Specificity : 0.9775
## Pos Pred Value : 0.8927
## Neg Pred Value : 0.9954
## Prevalence : 0.1611
## Detection Rate : 0.1573
## Detection Prevalence : 0.1762
## Balanced Accuracy : 0.9770
##
## 'Positive' Class : Left
##
feature_imp(nn_mod_1$finalModel)
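Since `pROC` is already loaded, the tuned model's class probabilities can also be summarized as an ROC curve and AUC, which complements the single-threshold confusion matrix above. A sketch, assuming the positive class is labeled 'Left':

```r
# Class probabilities rather than hard labels
probs <- predict(nn_mod_1, newdata = dNN_test, type = "prob")
# ROC curve for the 'Left' class; AUC summarizes ranking quality across
# all possible thresholds
roc_obj <- roc(response = dNN_test$Attrition, predictor = probs$Left,
               levels = c("Stayed", "Left"))
plot(roc_obj)
auc(roc_obj)
```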
The best-performing model used the Random Forest algorithm with mtry=11 and ntree=501.

Key driving features were:

* AvgHrs
* YearsAtCompany
* TotalWorkingYears
* MaritalStatus
* Age

Of these, the only factor the company can influence for its current employees is average working hours. Employees who work longer hours leave at a higher rate, so limiting overtime and encouraging regular eight-hour shifts may improve the retention rate.